The dataset is downloaded from https://www.kaggle.com/datasets/blastchar/telco-customer-churn?resource=download
This dataset consists of 7043 rows (records) and 21 columns (features). There are 20 independent features and 1 dependent feature ("Churn"). Churn means a customer leaving the business. The problem is to understand why customers are leaving, to recommend a solution to the stakeholders, and to provide a model that predicts customer churn.
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
import plotly.express as plx
# reading the csv file from the current directory
df = pd.read_csv('WA_Fn-UseC_-Telco-Customer-Churn.csv')
# printing the df variable which holds churn dataset from IBM
df
| | customerID | gender | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | ... | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7590-VHVEG | Female | 0 | Yes | No | 1 | No | No phone service | DSL | No | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No |
| 1 | 5575-GNVDE | Male | 0 | No | No | 34 | Yes | No | DSL | Yes | ... | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No |
| 2 | 3668-QPYBK | Male | 0 | No | No | 2 | Yes | No | DSL | Yes | ... | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 3 | 7795-CFOCW | Male | 0 | No | No | 45 | No | No phone service | DSL | Yes | ... | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No |
| 4 | 9237-HQITU | Female | 0 | No | No | 2 | Yes | No | Fiber optic | No | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 6840-RESVB | Male | 0 | Yes | Yes | 24 | Yes | Yes | DSL | Yes | ... | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No |
| 7039 | 2234-XADUH | Female | 0 | Yes | Yes | 72 | Yes | Yes | Fiber optic | No | ... | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No |
| 7040 | 4801-JZAZL | Female | 0 | Yes | Yes | 11 | No | No phone service | DSL | Yes | ... | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No |
| 7041 | 8361-LTMKD | Male | 1 | Yes | No | 4 | Yes | Yes | Fiber optic | No | ... | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes |
| 7042 | 3186-AJIEK | Male | 0 | No | No | 66 | Yes | No | Fiber optic | Yes | ... | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No |
7043 rows × 21 columns
# describe() gives summary statistics for the numeric columns of the dataset (df)
df.describe()
| | SeniorCitizen | tenure | MonthlyCharges |
|---|---|---|---|
| count | 7043.000000 | 7043.000000 | 7043.000000 |
| mean | 0.162147 | 32.371149 | 64.761692 |
| std | 0.368612 | 24.559481 | 30.090047 |
| min | 0.000000 | 0.000000 | 18.250000 |
| 25% | 0.000000 | 9.000000 | 35.500000 |
| 50% | 0.000000 | 29.000000 | 70.350000 |
| 75% | 0.000000 | 55.000000 | 89.850000 |
| max | 1.000000 | 72.000000 | 118.750000 |
# dtypes gives the datatypes of the individual features in the dataframe
df.dtypes
customerID           object
gender               object
SeniorCitizen         int64
Partner              object
Dependents           object
tenure                int64
PhoneService         object
MultipleLines        object
InternetService      object
OnlineSecurity       object
OnlineBackup         object
DeviceProtection     object
TechSupport          object
StreamingTV          object
StreamingMovies      object
Contract             object
PaperlessBilling     object
PaymentMethod        object
MonthlyCharges      float64
TotalCharges         object
Churn                object
dtype: object
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   customerID        7043 non-null   object
 1   gender            7043 non-null   object
 2   SeniorCitizen     7043 non-null   int64
 3   Partner           7043 non-null   object
 4   Dependents        7043 non-null   object
 5   tenure            7043 non-null   int64
 6   PhoneService      7043 non-null   object
 7   MultipleLines     7043 non-null   object
 8   InternetService   7043 non-null   object
 9   OnlineSecurity    7043 non-null   object
 10  OnlineBackup      7043 non-null   object
 11  DeviceProtection  7043 non-null   object
 12  TechSupport       7043 non-null   object
 13  StreamingTV       7043 non-null   object
 14  StreamingMovies   7043 non-null   object
 15  Contract          7043 non-null   object
 16  PaperlessBilling  7043 non-null   object
 17  PaymentMethod     7043 non-null   object
 18  MonthlyCharges    7043 non-null   float64
 19  TotalCharges      7043 non-null   object
 20  Churn             7043 non-null   object
dtypes: float64(1), int64(2), object(18)
memory usage: 1.1+ MB
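The output above lists 'TotalCharges' as object even though the values shown in the table look numeric. A minimal sketch of one way to convert such a column follows. The toy frame and the blank entry are assumptions made for illustration, not values taken from the dataset.

```python
import pandas as pd

# Toy stand-in for the 'TotalCharges' column: numbers stored as strings,
# plus one blank entry (an assumption about what such a column may contain).
toy = pd.DataFrame({"TotalCharges": ["29.85", "1889.5", " ", "108.15"]})

# errors="coerce" turns anything that cannot be parsed into NaN
toy["TotalCharges"] = pd.to_numeric(toy["TotalCharges"], errors="coerce")

# Count how many entries could not be parsed
n_missing = int(toy["TotalCharges"].isna().sum())
print(toy.dtypes)
print("unparseable entries:", n_missing)
```

Counting the NaN values after the conversion shows how many entries were not valid numbers, which is worth checking before deciding how to handle them.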
df = df.drop(['customerID'], axis = 1)
df
| | gender | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Female | 0 | Yes | No | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No |
| 1 | Male | 0 | No | No | 34 | Yes | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No |
| 2 | Male | 0 | No | No | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 3 | Male | 0 | No | No | 45 | No | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No |
| 4 | Female | 0 | No | No | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | Male | 0 | Yes | Yes | 24 | Yes | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No |
| 7039 | Female | 0 | Yes | Yes | 72 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No |
| 7040 | Female | 0 | Yes | Yes | 11 | No | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No |
| 7041 | Male | 1 | Yes | No | 4 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes |
| 7042 | Male | 0 | No | No | 66 | Yes | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No |
7043 rows × 20 columns
df['gender'].isnull().sum()
0
There is no null value in the 'gender' feature.
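The per-column null check used here can also be run for every column at once. The sketch below uses a small hypothetical frame (with one missing value added on purpose) rather than the churn data.

```python
import pandas as pd

# Hypothetical toy frame; the None is inserted deliberately for the demo
toy = pd.DataFrame({
    "gender": ["Female", "Male", None],
    "Partner": ["Yes", "No", "No"],
})

# isnull().sum() on a DataFrame returns one null count per column
null_counts = toy.isnull().sum()
print(null_counts)
```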
df['gender'].unique()
array(['Female', 'Male'], dtype=object)
There are two unique values in the 'gender' feature ('Female' and 'Male'). Let's one-hot encode this feature using get_dummies in pandas.
gender = pd.get_dummies(df['gender'], prefix = 'gender', dtype = 'int', drop_first = True)
gender
| | gender_Male |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 0 |
| ... | ... |
| 7038 | 1 |
| 7039 | 0 |
| 7040 | 0 |
| 7041 | 1 |
| 7042 | 1 |
7043 rows × 1 columns
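To see what drop_first changes in the get_dummies call above, here is a small sketch on a hypothetical four-row series. With two categories, drop_first = True keeps only the column for the second category in sorted order.

```python
import pandas as pd

# Hypothetical toy series standing in for df['gender']
toy = pd.Series(["Female", "Male", "Male", "Female"], name="gender")

# Without drop_first: one indicator column per category
full = pd.get_dummies(toy, prefix="gender", dtype="int")

# With drop_first=True: the first category (sorted order) is dropped
reduced = pd.get_dummies(toy, prefix="gender", dtype="int", drop_first=True)

print(list(full.columns))
print(list(reduced.columns))
```

A single 0/1 column is enough for a two-valued feature, because a 0 in 'gender_Male' already implies 'Female'.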
Removing the 'gender' column from the original dataframe.
df = df.drop(['gender'], axis = 1)
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Yes | No | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No |
| 1 | 0 | No | No | 34 | Yes | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No |
| 2 | 0 | No | No | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 3 | 0 | No | No | 45 | No | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No |
| 4 | 0 | No | No | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | Yes | Yes | 24 | Yes | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No |
| 7039 | 0 | Yes | Yes | 72 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No |
| 7040 | 0 | Yes | Yes | 11 | No | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No |
| 7041 | 1 | Yes | No | 4 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes |
| 7042 | 0 | No | No | 66 | Yes | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No |
7043 rows × 19 columns
Now, we use concat from pandas to join 'df' and 'gender'.
df = pd.concat((df, gender), axis = 1)
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Yes | No | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 |
| 1 | 0 | No | No | 34 | Yes | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 |
| 2 | 0 | No | No | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 |
| 3 | 0 | No | No | 45 | No | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 |
| 4 | 0 | No | No | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | Yes | Yes | 24 | Yes | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 |
| 7039 | 0 | Yes | Yes | 72 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 |
| 7040 | 0 | Yes | Yes | 11 | No | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 |
| 7041 | 1 | Yes | No | 4 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 |
| 7042 | 0 | No | No | 66 | Yes | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 |
7043 rows × 20 columns
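The concat step above works because 'df' and 'gender' share the same row index. The sketch below, on hypothetical toy frames, shows that pd.concat with axis = 1 aligns rows by index label, so a mismatched index would introduce missing values.

```python
import pandas as pd

# Hypothetical toy frames; both use the default index 0..2
left = pd.DataFrame({"tenure": [1, 34, 2]})
right = pd.DataFrame({"gender_Male": [0, 1, 1]})

# Matching indexes: columns are placed side by side, row for row
joined = pd.concat((left, right), axis=1)

# Mismatched index (1..3 instead of 0..2): rows are matched by label,
# so unmatched labels produce NaN and an extra row appears
shifted = pd.DataFrame({"gender_Male": [0, 1, 1]}, index=[1, 2, 3])
misaligned = pd.concat((left, shifted), axis=1)

print(joined.shape)
print(misaligned.shape)
print(int(misaligned.isna().sum().sum()))
```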
df['Partner'].isnull().sum()
0
df['Partner'].unique()
array(['Yes', 'No'], dtype=object)
There are only two unique values ('Yes' and 'No'), so let's label encode this feature: 'Yes' is replaced with 1 and 'No' with 0.
le = preprocessing.LabelEncoder()
df['Partner'] = le.fit_transform(df['Partner'])
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | No | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 |
| 1 | 0 | 0 | No | 34 | Yes | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 |
| 2 | 0 | 0 | No | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 |
| 3 | 0 | 0 | No | 45 | No | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 |
| 4 | 0 | 0 | No | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | Yes | 24 | Yes | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 |
| 7039 | 0 | 1 | Yes | 72 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 |
| 7040 | 0 | 1 | Yes | 11 | No | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 |
| 7041 | 1 | 1 | No | 4 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 |
| 7042 | 0 | 0 | No | 66 | Yes | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 |
7043 rows × 20 columns
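The 'Yes' to 1 and 'No' to 0 assignment is not set explicitly: LabelEncoder numbers the classes in sorted order, and 'No' sorts before 'Yes'. The sketch below checks this on a hypothetical toy series and shows an explicit mapping as an alternative. The mapping is a swapped-in technique for comparison, not what the notebook uses.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical toy series standing in for df['Partner']
toy = pd.Series(["Yes", "No", "No", "Yes"], name="Partner")

le = preprocessing.LabelEncoder()
encoded = le.fit_transform(toy)

# classes_ holds the labels in sorted order; the position is the code
print(list(le.classes_))
print(list(encoded))

# Alternative: an explicit dictionary makes the assignment visible
mapped = toy.map({"No": 0, "Yes": 1})
print(mapped.tolist())
```

Both approaches give the same codes here; the explicit map simply does not depend on alphabetical order.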
df['Dependents'].isnull().sum()
0
df['Dependents'].unique()
array(['No', 'Yes'], dtype=object)
There are only two unique values ('Yes' and 'No'), so let's label encode this feature: 'Yes' is replaced with 1 and 'No' with 0.
df['Dependents'] = le.fit_transform(df['Dependents'])
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | No | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 |
| 1 | 0 | 0 | 0 | 34 | Yes | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 |
| 2 | 0 | 0 | 0 | 2 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 |
| 3 | 0 | 0 | 0 | 45 | No | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 |
| 4 | 0 | 0 | 0 | 2 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | Yes | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 |
| 7039 | 0 | 1 | 1 | 72 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 |
| 7040 | 0 | 1 | 1 | 11 | No | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 |
| 7041 | 1 | 1 | 0 | 4 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 |
| 7042 | 0 | 0 | 0 | 66 | Yes | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 |
7043 rows × 20 columns
df['PhoneService'].isnull().sum()
0
df['PhoneService'].unique()
array(['No', 'Yes'], dtype=object)
There are only two unique values ('Yes' and 'No'), so let's label encode this feature: 'Yes' is replaced with 1 and 'No' with 0.
df['PhoneService'] = le.fit_transform(df['PhoneService'])
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | No phone service | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 |
| 2 | 0 | 0 | 0 | 2 | 1 | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 |
| 3 | 0 | 0 | 0 | 45 | 0 | No phone service | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 |
| 4 | 0 | 0 | 0 | 2 | 1 | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 |
| 7039 | 0 | 1 | 1 | 72 | 1 | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | No phone service | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 |
| 7042 | 0 | 0 | 0 | 66 | 1 | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 |
7043 rows × 20 columns
df['MultipleLines'].isnull().sum()
0
df['MultipleLines'].unique()
array(['No phone service', 'No', 'Yes'], dtype=object)
df.loc[df['MultipleLines'] == "No phone service", "MultipleLines"] = "No"
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | No | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | No | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 |
| 2 | 0 | 0 | 0 | 2 | 1 | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 |
| 3 | 0 | 0 | 0 | 45 | 0 | No | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 |
| 4 | 0 | 0 | 0 | 2 | 1 | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | Yes | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 |
| 7039 | 0 | 1 | 1 | 72 | 1 | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | No | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 |
| 7042 | 0 | 0 | 0 | 66 | 1 | No | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 |
7043 rows × 20 columns
df['MultipleLines'].unique()
array(['No', 'Yes'], dtype=object)
The 'No phone service' values are replaced with 'No'. Now we can perform label encoding using the object 'le'.
df['MultipleLines'] = le.fit_transform(df['MultipleLines'])
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | DSL | No | Yes | No | No | No | No | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | DSL | Yes | No | Yes | No | No | No | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | DSL | Yes | No | Yes | Yes | No | No | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | DSL | Yes | No | Yes | Yes | Yes | Yes | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | Fiber optic | No | Yes | Yes | No | Yes | Yes | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | DSL | Yes | No | No | No | No | No | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | Fiber optic | Yes | No | Yes | Yes | Yes | Yes | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 |
7043 rows × 20 columns
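Folding 'No phone service' into 'No' can also be written with Series.replace, which reads a little more directly than the boolean-mask assignment. This is a sketch on a hypothetical three-row frame, not a change to the steps above.

```python
import pandas as pd

# Hypothetical toy frame standing in for the 'MultipleLines' column
toy = pd.DataFrame({"MultipleLines": ["No phone service", "No", "Yes"]})

# replace() swaps every exact match of the first value for the second
toy["MultipleLines"] = toy["MultipleLines"].replace("No phone service", "No")

print(toy["MultipleLines"].unique())
```

In the rows displayed above, 'No phone service' appears only where 'PhoneService' is 'No', so the separate 'PhoneService' column still records that distinction after the merge.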
df['InternetService'].isnull().sum()
0
df['InternetService'].unique()
array(['DSL', 'Fiber optic', 'No'], dtype=object)
Let's perform one-hot encoding for the feature 'InternetService'
internet_service = pd.get_dummies(df['InternetService'], prefix = 'InternetService', dtype = 'int')
Dropping the feature 'InternetService' and including encoded features from 'internet_service'
df = df.drop(['InternetService'], axis = 1)
df = pd.concat((df, internet_service), axis = 1)
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | No | Yes | No | No | ... | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | Yes | No | Yes | No | ... | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | Yes | Yes | No | No | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | Yes | No | Yes | Yes | ... | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | No | No | No | No | ... | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | Yes | No | Yes | Yes | ... | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | No | Yes | Yes | No | ... | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | Yes | No | No | No | ... | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | No | No | No | No | ... | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | Yes | No | Yes | Yes | ... | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 |
7043 rows × 22 columns
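Unlike the 'gender' step, the 'InternetService' encoding keeps all three indicator columns. The sketch below, on a hypothetical toy series, shows that the three columns always sum to 1 per row, so any one of them can be recovered from the other two. Whether to drop one is a modelling choice that this notebook leaves as is.

```python
import pandas as pd

# Hypothetical toy series standing in for df['InternetService']
toy = pd.Series(["DSL", "Fiber optic", "No", "DSL"], name="InternetService")

onehot = pd.get_dummies(toy, prefix="InternetService", dtype="int")

# Exactly one indicator is 1 in each row
row_sums = onehot.sum(axis=1)

# The 'No' column can be rebuilt from the other two
rebuilt_no = 1 - onehot["InternetService_DSL"] - onehot["InternetService_Fiber optic"]

print(list(onehot.columns))
print(row_sums.tolist())
```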
df['OnlineSecurity'].isnull().sum()
0
df['OnlineSecurity'].unique()
array(['No', 'Yes', 'No internet service'], dtype=object)
df.loc[df['OnlineSecurity'] == "No internet service", "OnlineSecurity"] = "No"
The 'No internet service' values are replaced with 'No', so the feature can now be label encoded using the object 'le'.
df['OnlineSecurity'].unique()
array(['No', 'Yes'], dtype=object)
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | No | Yes | No | No | ... | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | Yes | No | Yes | No | ... | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | Yes | Yes | No | No | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | Yes | No | Yes | Yes | ... | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | No | No | No | No | ... | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | Yes | No | Yes | Yes | ... | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | No | Yes | Yes | No | ... | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | Yes | No | No | No | ... | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | No | No | No | No | ... | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | Yes | No | Yes | Yes | ... | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 |
7043 rows × 22 columns
df['OnlineSecurity'] = le.fit_transform(df['OnlineSecurity'])
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | Yes | No | No | ... | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | No | Yes | No | ... | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | Yes | No | No | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | No | Yes | Yes | ... | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | No | No | No | ... | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | No | Yes | Yes | ... | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | Yes | Yes | No | ... | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | No | No | No | ... | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | No | No | No | ... | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | No | Yes | Yes | ... | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 |
7043 rows × 22 columns
df['OnlineBackup'].isnull().sum()
0
df['OnlineBackup'].unique()
array(['Yes', 'No', 'No internet service'], dtype=object)
df.loc[df['OnlineBackup'] == "No internet service", "OnlineBackup"] = "No"
df['OnlineBackup'].unique()
array(['Yes', 'No'], dtype=object)
df['OnlineBackup'] = le.fit_transform(df['OnlineBackup'])
df
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | No | No | ... | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | Yes | No | ... | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | No | No | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | Yes | Yes | ... | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | No | No | ... | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | Yes | Yes | ... | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | Yes | No | ... | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | No | No | ... | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | No | No | ... | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | Yes | Yes | ... | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 |
7043 rows × 22 columns
df['DeviceProtection'].isnull().sum()
0
df['DeviceProtection'].unique()
array(['No', 'Yes', 'No internet service'], dtype=object)
df.loc[df['DeviceProtection'] == "No internet service", "DeviceProtection"] = "No"
df['DeviceProtection'].unique()
array(['No', 'Yes'], dtype=object)
df['DeviceProtection'] = le.fit_transform(df['DeviceProtection'])
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | No | ... | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | 1 | No | ... | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 0 | No | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | 1 | Yes | ... | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | No | ... | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | 1 | Yes | ... | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | 1 | No | ... | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | 0 | No | ... | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | No | ... | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | 1 | Yes | ... | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 |
7043 rows × 22 columns
df['TechSupport'].isnull().sum()
0
df['TechSupport'].unique()
array(['No', 'Yes', 'No internet service'], dtype=object)
df.loc[df['TechSupport'] == "No internet service", "TechSupport"] = "No"
df['TechSupport'].unique()
array(['No', 'Yes'], dtype=object)
df['TechSupport'] = le.fit_transform(df['TechSupport'])
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | ... | Month-to-month | Yes | Electronic check | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | 1 | 0 | ... | One year | No | Mailed check | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 0 | 0 | ... | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | 1 | 1 | ... | One year | No | Bank transfer (automatic) | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | ... | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | 1 | 1 | ... | One year | Yes | Mailed check | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | 1 | 0 | ... | One year | Yes | Credit card (automatic) | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | 0 | 0 | ... | Month-to-month | Yes | Electronic check | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 0 | ... | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | 1 | 1 | ... | Two year | Yes | Bank transfer (automatic) | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 |
7043 rows × 22 columns
df.dtypes
SeniorCitizen int64
Partner int32
Dependents int32
tenure int64
PhoneService int32
MultipleLines int32
OnlineSecurity int32
OnlineBackup int32
DeviceProtection int32
TechSupport int32
StreamingTV object
StreamingMovies object
Contract object
PaperlessBilling object
PaymentMethod object
MonthlyCharges float64
TotalCharges object
Churn object
gender_Male int32
InternetService_DSL int32
InternetService_Fiber optic int32
InternetService_No int32
dtype: object
df['StreamingTV'].isnull().sum()
0
df['StreamingTV'].unique()
array(['No', 'Yes', 'No internet service'], dtype=object)
df.loc[df['StreamingTV'] == "No internet service", "StreamingTV"] = "No"
df['StreamingTV'].unique()
array(['No', 'Yes'], dtype=object)
df['StreamingTV'] = le.fit_transform(df['StreamingTV'])
df['StreamingTV']
0 0
1 0
2 0
3 0
4 0
..
7038 1
7039 1
7040 0
7041 0
7042 1
Name: StreamingTV, Length: 7043, dtype: int32
df['StreamingMovies'].isnull().sum()
0
df['StreamingMovies'].unique()
array(['No', 'Yes', 'No internet service'], dtype=object)
df.loc[df['StreamingMovies'] == "No internet service", "StreamingMovies"] = "No"
df['StreamingMovies'].unique()
array(['No', 'Yes'], dtype=object)
df['StreamingMovies'] = le.fit_transform(df['StreamingMovies'])
df['StreamingMovies']
0 0
1 0
2 0
3 0
4 0
..
7038 1
7039 1
7040 0
7041 0
7042 1
Name: StreamingMovies, Length: 7043, dtype: int32
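The same three steps (inspect the unique values, fold "No internet service" into "No", label-encode) are repeated above for every service column. A minimal sketch of how they could be collapsed into one loop is shown below. The toy frame and the `service_cols` list are assumptions made for illustration, and an explicit mapping is swapped in for `LabelEncoder`; it gives the same codes here because `LabelEncoder` sorts the labels alphabetically (No becomes 0, Yes becomes 1).

```python
import pandas as pd

# Toy frame standing in for the churn data (values are illustrative only).
toy = pd.DataFrame({
    "StreamingTV": ["No", "Yes", "No internet service"],
    "StreamingMovies": ["Yes", "No internet service", "No"],
})

service_cols = ["StreamingTV", "StreamingMovies"]  # hypothetical column list

for col in service_cols:
    # Fold the third category into "No", then map the two labels to 0/1.
    toy[col] = toy[col].replace("No internet service", "No").map({"No": 0, "Yes": 1})

print(toy)
```

An explicit mapping also makes the meaning of 0 and 1 visible in the code, which a refitted encoder does not.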
df['Contract'].isnull().sum()
0
df['Contract'].unique()
array(['Month-to-month', 'One year', 'Two year'], dtype=object)
Contract has three unordered categories, so we have to one-hot encode it here.
contract = pd.get_dummies(df['Contract'], prefix = 'Contract', dtype = 'int' )
contract
| Contract_Month-to-month | Contract_One year | Contract_Two year | |
|---|---|---|---|
| 0 | 1 | 0 | 0 |
| 1 | 0 | 1 | 0 |
| 2 | 1 | 0 | 0 |
| 3 | 0 | 1 | 0 |
| 4 | 1 | 0 | 0 |
| ... | ... | ... | ... |
| 7038 | 0 | 1 | 0 |
| 7039 | 0 | 1 | 0 |
| 7040 | 1 | 0 | 0 |
| 7041 | 1 | 0 | 0 |
| 7042 | 0 | 0 | 1 |
7043 rows × 3 columns
df = df.drop(['Contract'], axis = 1)
df = pd.concat((df, contract), axis = 1)
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | ... | 29.85 | 29.85 | No | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | 1 | 0 | ... | 56.95 | 1889.5 | No | 1 | 1 | 0 | 0 | 0 | 1 | 0 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 0 | 0 | ... | 53.85 | 108.15 | Yes | 1 | 1 | 0 | 0 | 1 | 0 | 0 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | 1 | 1 | ... | 42.30 | 1840.75 | No | 1 | 1 | 0 | 0 | 0 | 1 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 70.70 | 151.65 | Yes | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | 1 | 1 | ... | 84.80 | 1990.5 | No | 1 | 1 | 0 | 0 | 0 | 1 | 0 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | 1 | 0 | ... | 103.20 | 7362.9 | No | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | 0 | 0 | ... | 29.60 | 346.45 | No | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 74.40 | 306.6 | Yes | 1 | 0 | 1 | 0 | 1 | 0 | 0 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | 1 | 1 | ... | 105.65 | 6844.5 | No | 1 | 0 | 1 | 0 | 0 | 0 | 1 |
7043 rows × 24 columns
df['PaperlessBilling'].isnull().sum()
0
df['PaperlessBilling'].unique()
array(['Yes', 'No'], dtype=object)
df['PaperlessBilling'] = le.fit_transform(df['PaperlessBilling'])
df['PaperlessBilling']
0 1
1 0
2 1
3 0
4 1
..
7038 1
7039 1
7040 1
7041 1
7042 1
Name: PaperlessBilling, Length: 7043, dtype: int32
df['PaymentMethod'].isnull().sum()
0
df['PaymentMethod'].unique()
array(['Electronic check', 'Mailed check', 'Bank transfer (automatic)',
'Credit card (automatic)'], dtype=object)
PaymentMethod = pd.get_dummies(df['PaymentMethod'], prefix = 'PaymentMethod', dtype = 'int')
PaymentMethod
| PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 0 | 1 |
| 3 | 1 | 0 | 0 | 0 |
| 4 | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... |
| 7038 | 0 | 0 | 0 | 1 |
| 7039 | 0 | 1 | 0 | 0 |
| 7040 | 0 | 0 | 1 | 0 |
| 7041 | 0 | 0 | 0 | 1 |
| 7042 | 1 | 0 | 0 | 0 |
7043 rows × 4 columns
df = df.drop(['PaymentMethod'], axis = 1)
df = pd.concat((df, PaymentMethod), axis = 1)
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | ... | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | ... | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | 1 | 0 | ... | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 0 | 0 | ... | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | 1 | 1 | ... | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | 1 | 1 | ... | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | 1 | 0 | ... | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | 1 | 1 | ... | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
7043 rows × 27 columns
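The encode, drop, and concat steps used above for Contract and PaymentMethod can also be done in a single call. The sketch below runs on a small made-up frame, not the real data, and relies on the `columns` parameter of `pd.get_dummies`, which encodes the listed columns, removes the originals, and uses each column name as the prefix.

```python
import pandas as pd

# Small made-up frame with the same category labels as the dataset.
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "One year", "Two year"],
    "PaymentMethod": ["Electronic check", "Mailed check", "Electronic check"],
    "tenure": [1, 34, 66],
})

# One call replaces get_dummies + drop + concat for each listed column.
encoded = pd.get_dummies(toy, columns=["Contract", "PaymentMethod"], dtype=int)

print(encoded.columns.tolist())
```

Columns that are not listed (here `tenure`) are passed through unchanged and appear first in the result.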
df.dtypes
SeniorCitizen int64
Partner int32
Dependents int32
tenure int64
PhoneService int32
MultipleLines int32
OnlineSecurity int32
OnlineBackup int32
DeviceProtection int32
TechSupport int32
StreamingTV int32
StreamingMovies int32
PaperlessBilling int32
MonthlyCharges float64
TotalCharges object
Churn object
gender_Male int32
InternetService_DSL int32
InternetService_Fiber optic int32
InternetService_No int32
Contract_Month-to-month int32
Contract_One year int32
Contract_Two year int32
PaymentMethod_Bank transfer (automatic) int32
PaymentMethod_Credit card (automatic) int32
PaymentMethod_Electronic check int32
PaymentMethod_Mailed check int32
dtype: object
df['TotalCharges'].isnull().sum()
0
df['TotalCharges']
0 29.85
1 1889.5
2 108.15
3 1840.75
4 151.65
...
7038 1990.5
7039 7362.9
7040 346.45
7041 306.6
7042 6844.5
Name: TotalCharges, Length: 7043, dtype: object
The values printed above all look like floating-point numbers, but the dtype is reported as object. So we have to convert the column from object to float.
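One way to find out why a numeric-looking column is stored as object is to attempt the conversion with `errors="coerce"` and look at the entries that turn into NaN. The sketch below uses a short made-up Series in place of the real column.

```python
import pandas as pd

# Made-up values; the blank string mimics an entry that is not a number.
charges = pd.Series(["29.85", "1889.5", " ", "108.15"], name="TotalCharges")

# errors="coerce" turns anything unparseable into NaN instead of raising.
as_number = pd.to_numeric(charges, errors="coerce")

# Rows that became NaN are the ones blocking a plain conversion.
bad_rows = charges[as_number.isna()]
print(bad_rows)
```

On the real data the same idea would be applied to `df['TotalCharges']`, assuming the column still holds the raw strings.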
#df['TotalCharges'] = pd.to_numeric(df['TotalCharges'])
df['TotalCharges'][488]
' '
df.iloc[488]
SeniorCitizen 0
Partner 1
Dependents 1
tenure 0
PhoneService 0
MultipleLines 0
OnlineSecurity 1
OnlineBackup 0
DeviceProtection 1
TechSupport 1
StreamingTV 1
StreamingMovies 0
PaperlessBilling 1
MonthlyCharges 52.55
TotalCharges
Churn No
gender_Male 0
InternetService_DSL 1
InternetService_Fiber optic 0
InternetService_No 0
Contract_Month-to-month 0
Contract_One year 0
Contract_Two year 1
PaymentMethod_Bank transfer (automatic) 1
PaymentMethod_Credit card (automatic) 0
PaymentMethod_Electronic check 0
PaymentMethod_Mailed check 0
Name: 488, dtype: object
pd.set_option('display.max_columns', None)
df.where(df['tenure'] == 0).dropna()
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 488 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 52.55 | | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 753 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 20.25 | | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 936 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 80.85 | | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1082 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 25.75 | | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1340 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 56.05 | | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3331 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19.85 | | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 3826 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 25.35 | | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 4380 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 20.00 | | No | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 5218 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 19.70 | | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 6670 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 73.35 | | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 6754 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 61.90 | | No | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
to_change = df.where(df['TotalCharges'] == ' ').dropna()
to_change.index
Index([488, 753, 936, 1082, 1340, 3331, 3826, 4380, 5218, 6670, 6754], dtype='int64')
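for x in to_change.index:
    # Caution: the mask matches every blank row at once, so the first pass (x = 488)
    # fills all of them with row 488's MonthlyCharges (52.55), as the next output shows.
    df.loc[df['TotalCharges'] == " ", "TotalCharges"] = df.iloc[x]['MonthlyCharges']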
df.where(df['tenure'] == 0).dropna()
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 488 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 52.55 | 52.55 | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 753 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 20.25 | 52.55 | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 936 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 80.85 | 52.55 | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1082 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 25.75 | 52.55 | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1340 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 56.05 | 52.55 | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3331 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19.85 | 52.55 | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 3826 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 25.35 | 52.55 | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 4380 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 20.00 | 52.55 | No | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 5218 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 19.70 | 52.55 | No | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 6670 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 73.35 | 52.55 | No | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 6754 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 61.90 | 52.55 | No | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
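In the output above every blank TotalCharges ended up as 52.55, the MonthlyCharges of row 488, rather than each row's own MonthlyCharges. If a per-row fill is what was intended, one possible approach is sketched below on a made-up frame: coerce the column to numbers and fill the resulting NaN values from MonthlyCharges, which pandas aligns row by row.

```python
import pandas as pd

# Made-up frame: two blank totals and one ordinary value.
toy = pd.DataFrame({
    "MonthlyCharges": [52.55, 20.25, 56.95],
    "TotalCharges": [" ", " ", "1889.5"],
})

# Blanks become NaN, then each NaN is filled from the same row's MonthlyCharges.
toy["TotalCharges"] = (
    pd.to_numeric(toy["TotalCharges"], errors="coerce")
    .fillna(toy["MonthlyCharges"])
)

print(toy)
```

Whether MonthlyCharges or 0 is the better fill for customers with tenure 0 is a modelling choice; the sketch only shows the mechanics.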
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'])
df.dtypes
SeniorCitizen int64
Partner int32
Dependents int32
tenure int64
PhoneService int32
MultipleLines int32
OnlineSecurity int32
OnlineBackup int32
DeviceProtection int32
TechSupport int32
StreamingTV int32
StreamingMovies int32
PaperlessBilling int32
MonthlyCharges float64
TotalCharges float64
Churn object
gender_Male int32
InternetService_DSL int32
InternetService_Fiber optic int32
InternetService_No int32
Contract_Month-to-month int32
Contract_One year int32
Contract_Two year int32
PaymentMethod_Bank transfer (automatic) int32
PaymentMethod_Credit card (automatic) int32
PaymentMethod_Electronic check int32
PaymentMethod_Mailed check int32
dtype: object
df['Churn'].isnull().sum()
0
df['Churn'].unique()
array(['No', 'Yes'], dtype=object)
df['Churn'] = le.fit_transform(df['Churn'])
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 29.85 | 29.85 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 56.95 | 1889.50 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 53.85 | 108.15 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 42.30 | 1840.75 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 70.70 | 151.65 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 84.80 | 1990.50 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 103.20 | 7362.90 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 29.60 | 346.45 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 74.40 | 306.60 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 105.65 | 6844.50 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
7043 rows × 27 columns
df.dtypes
SeniorCitizen int64
Partner int32
Dependents int32
tenure int64
PhoneService int32
MultipleLines int32
OnlineSecurity int32
OnlineBackup int32
DeviceProtection int32
TechSupport int32
StreamingTV int32
StreamingMovies int32
PaperlessBilling int32
MonthlyCharges float64
TotalCharges float64
Churn int32
gender_Male int32
InternetService_DSL int32
InternetService_Fiber optic int32
InternetService_No int32
Contract_Month-to-month int32
Contract_One year int32
Contract_Two year int32
PaymentMethod_Bank transfer (automatic) int32
PaymentMethod_Credit card (automatic) int32
PaymentMethod_Electronic check int32
PaymentMethod_Mailed check int32
dtype: object
fig = plx.histogram(df, x = 'Churn', title = "Churn", color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
temp_df = pd.read_csv('WA_Fn-UseC_-Telco-Customer-Churn.csv')
temp_df.dtypes
customerID object
gender object
SeniorCitizen int64
Partner object
Dependents object
tenure int64
PhoneService object
MultipleLines object
InternetService object
OnlineSecurity object
OnlineBackup object
DeviceProtection object
TechSupport object
StreamingTV object
StreamingMovies object
Contract object
PaperlessBilling object
PaymentMethod object
MonthlyCharges float64
TotalCharges object
Churn object
dtype: object
import plotly.express as plx
fig = plx.bar(temp_df, x = 'gender', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
fig = plx.histogram(temp_df, x = 'SeniorCitizen', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
fig = plx.histogram(temp_df, x = 'Partner', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
fig = plx.histogram(temp_df, x = 'Dependents', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
fig = plx.scatter(temp_df, y = 'tenure', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
This scatter plot is hard to read. So, we're going to separate the tenure values by churn value.
tenure_churn_yes, tenure_churn_no = [], []
for i in range(len(temp_df)):
    tenure = temp_df.iloc[i]['tenure']
    churn = temp_df.iloc[i]['Churn']
    if churn == 'Yes':
        tenure_churn_yes.append(tenure)
    else:
        tenure_churn_no.append(tenure)
fig = plx.scatter(y = tenure_churn_yes, title = 'Churn with tenure')
fig.update_traces(dict(marker_line_width=0))
fig.show()
fig = plx.scatter(y = tenure_churn_no, title = 'No Churn with Tenure')
fig.update_traces(dict(marker_line_width=0))
fig.show()
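The row-by-row loop used above to split tenure by churn can also be written with a boolean mask, which avoids indexing each row separately. A minimal sketch follows, using the first five rows of the dataset as a toy frame.

```python
import pandas as pd

# First five rows of the dataset (tenure and Churn only).
toy = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 2],
    "Churn": ["No", "No", "Yes", "No", "Yes"],
})

# One comparison builds the mask; ~ inverts it for the non-churn group.
is_churn = toy["Churn"] == "Yes"
tenure_churn_yes = toy.loc[is_churn, "tenure"].tolist()
tenure_churn_no = toy.loc[~is_churn, "tenure"].tolist()

print(tenure_churn_yes, tenure_churn_no)
```

The two lists can be passed to `plx.scatter` exactly as in the cells above.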
fig = plx.histogram(temp_df, x = 'PhoneService', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
temp_df.dtypes
customerID object
gender object
SeniorCitizen int64
Partner object
Dependents object
tenure int64
PhoneService object
MultipleLines object
InternetService object
OnlineSecurity object
OnlineBackup object
DeviceProtection object
TechSupport object
StreamingTV object
StreamingMovies object
Contract object
PaperlessBilling object
PaymentMethod object
MonthlyCharges float64
TotalCharges object
Churn object
dtype: object
temp_df['MultipleLines'].unique()
array(['No phone service', 'No', 'Yes'], dtype=object)
temp_df.loc[temp_df['MultipleLines'] == 'No phone service', 'MultipleLines'] = 'No'
temp_df['MultipleLines'].unique()
array(['No', 'Yes'], dtype=object)
fig = plx.histogram(temp_df, x = 'MultipleLines', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
temp_df['InternetService'].unique()
array(['DSL', 'Fiber optic', 'No'], dtype=object)
fig = plx.histogram(temp_df, x = 'InternetService', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
temp_df['OnlineSecurity'].unique()
array(['No', 'Yes', 'No internet service'], dtype=object)
temp_df.loc[temp_df['OnlineSecurity'] == 'No internet service', 'OnlineSecurity'] = 'No'
temp_df['OnlineSecurity'].unique()
array(['No', 'Yes'], dtype=object)
fig = plx.histogram(temp_df, x = 'OnlineSecurity', color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
fig = plx.scatter(temp_df, x = 'MonthlyCharges', color = 'Churn')
fig.show()
fig = plx.scatter(temp_df, x = 'TotalCharges', color = 'Churn')
fig.show()
plx.imshow(df.corr(), text_auto = True, height = 1750, width = 1750)
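A 27 × 27 heatmap is hard to scan for the one column we care about. As a complement, the Churn column of the correlation matrix can be pulled out and sorted. The sketch below uses a six-row toy frame built from rows shown earlier in this notebook, so the numbers it prints say nothing about the full dataset.

```python
import pandas as pd

# Six rows taken from the tables above (Churn already encoded as 0/1).
toy = pd.DataFrame({
    "tenure": [1, 34, 2, 45, 2, 8],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 70.70, 99.65],
    "Churn": [0, 0, 1, 0, 1, 1],
})

# Keep only the correlations with Churn, drop the trivial self-correlation,
# and sort from most negative to most positive.
corr_with_churn = toy.corr()["Churn"].drop("Churn").sort_values()

print(corr_with_churn)
```

On the real frame the same expression would be `df.corr()['Churn'].drop('Churn').sort_values()`.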
temp_df['Churn'].unique()
array(['No', 'Yes'], dtype=object)
churn_yes_df = temp_df.where((temp_df['Churn'] == 'Yes')).dropna()
churn_yes_df
| customerID | gender | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | InternetService | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | Contract | PaperlessBilling | PaymentMethod | MonthlyCharges | TotalCharges | Churn | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 3668-QPYBK | Male | 0.0 | No | No | 2.0 | Yes | No | DSL | Yes | Yes | No | No | No | No | Month-to-month | Yes | Mailed check | 53.85 | 108.15 | Yes |
| 4 | 9237-HQITU | Female | 0.0 | No | No | 2.0 | Yes | No | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 70.70 | 151.65 | Yes |
| 5 | 9305-CDSKC | Female | 0.0 | No | No | 8.0 | Yes | Yes | Fiber optic | No | No | Yes | No | Yes | Yes | Month-to-month | Yes | Electronic check | 99.65 | 820.5 | Yes |
| 8 | 7892-POOKP | Female | 0.0 | Yes | No | 28.0 | Yes | Yes | Fiber optic | No | No | Yes | Yes | Yes | Yes | Month-to-month | Yes | Electronic check | 104.80 | 3046.05 | Yes |
| 13 | 0280-XJGEX | Male | 0.0 | No | No | 49.0 | Yes | Yes | Fiber optic | No | Yes | Yes | No | Yes | Yes | Month-to-month | Yes | Bank transfer (automatic) | 103.70 | 5036.3 | Yes |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7021 | 1699-HPSBG | Male | 0.0 | No | No | 12.0 | Yes | No | DSL | No | No | No | Yes | Yes | No | One year | Yes | Electronic check | 59.80 | 727.8 | Yes |
| 7026 | 8775-CEBBJ | Female | 0.0 | No | No | 9.0 | Yes | No | DSL | No | No | No | No | No | No | Month-to-month | Yes | Bank transfer (automatic) | 44.20 | 403.35 | Yes |
| 7032 | 6894-LFHLY | Male | 1.0 | No | No | 1.0 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Electronic check | 75.75 | 75.75 | Yes |
| 7034 | 0639-TSIQW | Female | 0.0 | No | No | 67.0 | Yes | Yes | Fiber optic | Yes | Yes | Yes | No | Yes | No | Month-to-month | Yes | Credit card (automatic) | 102.95 | 6886.25 | Yes |
| 7041 | 8361-LTMKD | Male | 1.0 | Yes | No | 4.0 | Yes | Yes | Fiber optic | No | No | No | No | No | No | Month-to-month | Yes | Mailed check | 74.40 | 306.6 | Yes |
1869 rows × 21 columns
There are 1869 churned customers.
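Note that SeniorCitizen and tenure appear as 0.0 and 2.0 in the table above. This happens because `where` first fills the non-matching rows with NaN, which forces integer columns to float, and `dropna` then removes those rows without restoring the dtype. A sketch of the difference against plain boolean indexing, on a toy frame:

```python
import pandas as pd

# Toy frame with integer columns, as in the raw CSV.
toy = pd.DataFrame({
    "SeniorCitizen": [0, 0, 0, 0, 0],
    "tenure": [1, 34, 2, 45, 2],
    "Churn": ["No", "No", "Yes", "No", "Yes"],
})

# Same pattern as above: mask with where, then drop the NaN rows.
via_where = toy.where(toy["Churn"] == "Yes").dropna()

# Boolean indexing selects the rows directly and keeps the dtypes.
via_mask = toy[toy["Churn"] == "Yes"]

print(via_where.dtypes)
print(via_mask.dtypes)
```

Both approaches select the same rows; only the column dtypes differ.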
churn_yes_df['Dependents'].value_counts()
Dependents
No 1543
Yes 326
Name: count, dtype: int64
fig = plx.histogram(churn_yes_df, x = 'Dependents', title = "Churn on Dependents")
fig.update_traces(dict(marker_line_width=0))
fig.show()
churn_yes_df['tenure'].value_counts()
tenure
1.0 380
2.0 123
3.0 94
4.0 83
5.0 64
...
60.0 6
72.0 6
62.0 5
64.0 4
63.0 4
Name: count, Length: 72, dtype: int64
plx.histogram(x = churn_yes_df['tenure'])
churn_yes_df['PhoneService'].value_counts()
PhoneService
Yes 1699
No 170
Name: count, dtype: int64
fig = plx.histogram(churn_yes_df, x = 'PhoneService', title = "Churn on PhoneService")
fig.update_traces(dict(marker_line_width=0))
fig.show()
churn_yes_df['InternetService'].value_counts()
InternetService
Fiber optic 1297
DSL 459
No 113
Name: count, dtype: int64
fig = plx.histogram(churn_yes_df, x = 'InternetService', title = "Churn on InternetService")
fig.update_traces(dict(marker_line_width=0))
fig.show()
churn_yes_df['Contract'].value_counts()
Contract
Month-to-month 1655
One year 166
Two year 48
Name: count, dtype: int64
fig = plx.histogram(churn_yes_df, x = 'Contract', title = "Churn on Contract")
fig.update_traces(dict(marker_line_width=0))
fig.show()
churn_yes_df['PaymentMethod'].value_counts()
PaymentMethod
Electronic check 1071
Mailed check 308
Bank transfer (automatic) 258
Credit card (automatic) 232
Name: count, dtype: int64
fig = plx.histogram(churn_yes_df, x = 'PaymentMethod', title = "PaymentMethod")
fig.update_traces(dict(marker_line_width=0))
fig.show()
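The counts above are taken among churned customers only, so a large count may simply reflect a category that is common overall. A churn rate per category takes the group size into account. The sketch below shows one way to compute it on a made-up frame; the values are invented for illustration.

```python
import pandas as pd

# Invented rows; only the mechanics matter here.
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "One year", "Two year", "Month-to-month"],
    "Churn": ["Yes", "No", "No", "No", "No", "Yes"],
})

# The mean of a boolean Series is the share of True values,
# so this gives the fraction of churned customers in each contract group.
churn_rate = (toy["Churn"] == "Yes").groupby(toy["Contract"]).mean()

print(churn_rate)
```

The same pattern would apply to `temp_df` with any of the categorical columns plotted above.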
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 29.85 | 29.85 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 34 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 56.95 | 1889.50 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 2 | 0 | 0 | 0 | 2 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 53.85 | 108.15 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 3 | 0 | 0 | 0 | 45 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 42.30 | 1840.75 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 70.70 | 151.65 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0 | 1 | 1 | 24 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 84.80 | 1990.50 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 7039 | 0 | 1 | 1 | 72 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 103.20 | 7362.90 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 7040 | 0 | 1 | 1 | 11 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 29.60 | 346.45 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 7041 | 1 | 1 | 0 | 4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 74.40 | 306.60 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 7042 | 0 | 0 | 0 | 66 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 105.65 | 6844.50 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 |
7043 rows × 27 columns
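The next step rescales every column with min-max scaling, which maps each value x to (x - min) / (max - min), so each column ends up in the range 0 to 1. A minimal sketch of the arithmetic with plain pandas is given below, using a few tenure values; tenure runs from 0 to 72 in this dataset.

```python
import pandas as pd

# A few tenure values, including the minimum (0) and maximum (72).
tenure = pd.Series([0, 1, 34, 72], name="tenure")

# Min-max scaling by hand: subtract the minimum, divide by the range.
scaled = (tenure - tenure.min()) / (tenure.max() - tenure.min())

# 1/72 is about 0.013889 and 34/72 is about 0.472222.
print(scaled.round(6).tolist())
```

Columns that already hold only 0 and 1 are left unchanged by this transformation.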
from sklearn.preprocessing import MinMaxScaler
mms = MinMaxScaler()
df_new = mms.fit_transform(df)
df = pd.DataFrame(df_new, columns = df.columns)
df
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 1.0 | 0.0 | 0.013889 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.115423 | 0.001275 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 1 | 0.0 | 0.0 | 0.0 | 0.472222 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.385075 | 0.215867 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 2 | 0.0 | 0.0 | 0.0 | 0.027778 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.354229 | 0.010310 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 3 | 0.0 | 0.0 | 0.0 | 0.625000 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.239303 | 0.210241 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 4 | 0.0 | 0.0 | 0.0 | 0.027778 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.521891 | 0.015330 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0.0 | 1.0 | 1.0 | 0.333333 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.662189 | 0.227521 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 7039 | 0.0 | 1.0 | 1.0 | 1.000000 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.845274 | 0.847461 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 7040 | 0.0 | 1.0 | 1.0 | 0.152778 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.112935 | 0.037809 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 7041 | 1.0 | 1.0 | 0.0 | 0.055556 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.558706 | 0.033210 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 7042 | 0.0 | 0.0 | 0.0 | 0.916667 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.869652 | 0.787641 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
7043 rows × 27 columns
Now every column is scaled to the range 0 to 1.
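The scaled values in the table above can be checked by hand with the min-max formula `(x - min) / (max - min)`. The sketch below is a plain-Python illustration, not the scikit-learn implementation; the helper name `min_max_scale` is hypothetical, and the tenure minimum of 0 and maximum of 72 are assumptions inferred from the output (72 maps to 1.0 and 1 maps to 0.013889).

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative tenure values; 0 and 72 are assumed to be the column's min and max
tenure = [0, 1, 34, 45, 72]
scaled = min_max_scale(tenure)
print([round(s, 6) for s in scaled])
```

The values for tenure 1, 34 and 45 agree with rows 0, 1 and 3 of the scaled table.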
from keras.layers import Dense, Dropout
from keras import Sequential
ANN_model = Sequential()
# First Dense layer (27 units); it receives the 26 input features directly
ANN_model.add(Dense(units = 27, activation = 'relu'))
# Second Dense layer (15 units), followed by 40% dropout
ANN_model.add(Dense(units = 15, activation = 'relu'))
ANN_model.add(Dropout(0.4))
# Third Dense layer (7 units), followed by 30% dropout
ANN_model.add(Dense(units = 7, activation = 'relu'))
ANN_model.add(Dropout(0.3))
# Output layer: one sigmoid unit giving the predicted churn probability
ANN_model.add(Dense(units = 1, activation = 'sigmoid'))
ANN_model.compile(optimizer = 'adam',
                  loss = 'binary_crossentropy',
                  metrics = ['accuracy'])
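To get a feel for the size of this network, the number of trainable parameters can be counted by hand. This is a rough sketch, assuming 26 input features (the 27 columns minus `Churn`) and the Dense layer widths used above; the helper name `dense_param_count` is hypothetical. Dropout layers add no parameters.

```python
def dense_param_count(n_inputs, layer_units):
    """Return per-layer and total trainable parameters for a stack of Dense layers."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        # Each Dense layer has a (fan_in x units) weight matrix plus one bias per unit
        counts.append(fan_in * units + units)
        fan_in = units
    return counts, sum(counts)

# 26 input features, then Dense layers of 27, 15, 7 and 1 units
per_layer, total = dense_param_count(n_inputs=26, layer_units=[27, 15, 7, 1])
print(per_layer, total)
```

Once the model has been built, `ANN_model.summary()` should report the same counts.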
x_train, x_test, y_train, y_test = train_test_split( df.drop(['Churn'], axis = 1), df['Churn'], test_size = 0.2, random_state = 35)
import tensorflow as tf
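# Note: monitor="accuracy" tracks training accuracy; use "val_loss" or "val_accuracy"
# to stop on validation performance instead
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="accuracy",
    min_delta=0.0001,
    patience=20,
    verbose=1,
    mode="auto",
    baseline=None,
    restore_best_weights=True
)
# The test split doubles as validation data here; callbacks are passed as a list
model_history = ANN_model.fit(x_train, y_train, batch_size = 2, epochs = 150, validation_data = (x_test, y_test), callbacks = [early_stopping])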
Epoch 1/150 2817/2817 [==============================] - 14s 4ms/step - loss: 0.5081 - accuracy: 0.7469 - val_loss: 0.4054 - val_accuracy: 0.8070 Epoch 2/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4685 - accuracy: 0.7760 - val_loss: 0.3948 - val_accuracy: 0.8148 Epoch 3/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4550 - accuracy: 0.7812 - val_loss: 0.4082 - val_accuracy: 0.8204 Epoch 4/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4552 - accuracy: 0.7849 - val_loss: 0.3983 - val_accuracy: 0.8098 Epoch 5/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4506 - accuracy: 0.7863 - val_loss: 0.3942 - val_accuracy: 0.8211 Epoch 6/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4410 - accuracy: 0.7973 - val_loss: 0.4095 - val_accuracy: 0.8070 Epoch 7/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4435 - accuracy: 0.7868 - val_loss: 0.3960 - val_accuracy: 0.8112 Epoch 8/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4451 - accuracy: 0.7918 - val_loss: 0.3951 - val_accuracy: 0.8098 Epoch 9/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4426 - accuracy: 0.7875 - val_loss: 0.3963 - val_accuracy: 0.8176 Epoch 10/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4388 - accuracy: 0.7874 - val_loss: 0.3956 - val_accuracy: 0.8126 Epoch 11/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4392 - accuracy: 0.7948 - val_loss: 0.3959 - val_accuracy: 0.8112 Epoch 12/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4345 - accuracy: 0.7897 - val_loss: 0.3897 - val_accuracy: 0.8219 Epoch 13/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4337 - accuracy: 0.7927 - val_loss: 0.3914 - val_accuracy: 0.8233 Epoch 14/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4370 - accuracy: 
0.7923 - val_loss: 0.3917 - val_accuracy: 0.8169 Epoch 15/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4358 - accuracy: 0.7904 - val_loss: 0.3904 - val_accuracy: 0.8247 Epoch 16/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4303 - accuracy: 0.7980 - val_loss: 0.4021 - val_accuracy: 0.8077 Epoch 17/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4275 - accuracy: 0.7952 - val_loss: 0.3934 - val_accuracy: 0.8183 Epoch 18/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4328 - accuracy: 0.7953 - val_loss: 0.3975 - val_accuracy: 0.8098 Epoch 19/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4275 - accuracy: 0.7948 - val_loss: 0.3975 - val_accuracy: 0.8133 Epoch 20/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4226 - accuracy: 0.7968 - val_loss: 0.3950 - val_accuracy: 0.8204 Epoch 21/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4252 - accuracy: 0.7941 - val_loss: 0.3907 - val_accuracy: 0.8233 Epoch 22/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4278 - accuracy: 0.7984 - val_loss: 0.3929 - val_accuracy: 0.8176 Epoch 23/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4219 - accuracy: 0.7985 - val_loss: 0.3975 - val_accuracy: 0.8197 Epoch 24/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4178 - accuracy: 0.7985 - val_loss: 0.3994 - val_accuracy: 0.8105 Epoch 25/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4176 - accuracy: 0.8012 - val_loss: 0.4008 - val_accuracy: 0.8041 Epoch 26/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4179 - accuracy: 0.7984 - val_loss: 0.3999 - val_accuracy: 0.8084 Epoch 27/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4179 - accuracy: 0.7991 - val_loss: 0.4014 - val_accuracy: 0.8084 Epoch 28/150 2817/2817 
[==============================] - 11s 4ms/step - loss: 0.4177 - accuracy: 0.8003 - val_loss: 0.4033 - val_accuracy: 0.8133 Epoch 29/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4162 - accuracy: 0.8012 - val_loss: 0.3943 - val_accuracy: 0.8155 Epoch 30/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4114 - accuracy: 0.8028 - val_loss: 0.4029 - val_accuracy: 0.8133 Epoch 31/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4127 - accuracy: 0.7941 - val_loss: 0.4018 - val_accuracy: 0.8133 Epoch 32/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4137 - accuracy: 0.8019 - val_loss: 0.4062 - val_accuracy: 0.8091 Epoch 33/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4114 - accuracy: 0.8042 - val_loss: 0.4003 - val_accuracy: 0.8126 Epoch 34/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4120 - accuracy: 0.8035 - val_loss: 0.4083 - val_accuracy: 0.8105 Epoch 35/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4114 - accuracy: 0.8035 - val_loss: 0.4066 - val_accuracy: 0.8070 Epoch 36/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4069 - accuracy: 0.8028 - val_loss: 0.4109 - val_accuracy: 0.8077 Epoch 37/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4101 - accuracy: 0.8028 - val_loss: 0.4146 - val_accuracy: 0.8084 Epoch 38/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4056 - accuracy: 0.7993 - val_loss: 0.4067 - val_accuracy: 0.8084 Epoch 39/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4087 - accuracy: 0.8039 - val_loss: 0.4095 - val_accuracy: 0.8091 Epoch 40/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4052 - accuracy: 0.8049 - val_loss: 0.4062 - val_accuracy: 0.8070 Epoch 41/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4062 - accuracy: 0.8051 - 
val_loss: 0.4131 - val_accuracy: 0.8126 Epoch 42/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4114 - accuracy: 0.7996 - val_loss: 0.4093 - val_accuracy: 0.8105 Epoch 43/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4053 - accuracy: 0.8055 - val_loss: 0.4035 - val_accuracy: 0.8048 Epoch 44/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4028 - accuracy: 0.8087 - val_loss: 0.4151 - val_accuracy: 0.8062 Epoch 45/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3968 - accuracy: 0.8090 - val_loss: 0.4119 - val_accuracy: 0.8084 Epoch 46/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4073 - accuracy: 0.8069 - val_loss: 0.4066 - val_accuracy: 0.8062 Epoch 47/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4056 - accuracy: 0.8044 - val_loss: 0.4089 - val_accuracy: 0.8141 Epoch 48/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4072 - accuracy: 0.8040 - val_loss: 0.4115 - val_accuracy: 0.8105 Epoch 49/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4028 - accuracy: 0.8076 - val_loss: 0.4109 - val_accuracy: 0.8098 Epoch 50/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4027 - accuracy: 0.8074 - val_loss: 0.4141 - val_accuracy: 0.8119 Epoch 51/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4036 - accuracy: 0.8072 - val_loss: 0.4157 - val_accuracy: 0.8077 Epoch 52/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4005 - accuracy: 0.8065 - val_loss: 0.4073 - val_accuracy: 0.8070 Epoch 53/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4025 - accuracy: 0.8048 - val_loss: 0.4161 - val_accuracy: 0.8077 Epoch 54/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4024 - accuracy: 0.8042 - val_loss: 0.4077 - val_accuracy: 0.8098 Epoch 55/150 2817/2817 
[==============================] - 11s 4ms/step - loss: 0.4002 - accuracy: 0.8046 - val_loss: 0.4103 - val_accuracy: 0.8062 Epoch 56/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4009 - accuracy: 0.8081 - val_loss: 0.4157 - val_accuracy: 0.8119 Epoch 57/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4012 - accuracy: 0.8081 - val_loss: 0.4056 - val_accuracy: 0.8105 Epoch 58/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3977 - accuracy: 0.8049 - val_loss: 0.4151 - val_accuracy: 0.8126 Epoch 59/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3979 - accuracy: 0.8012 - val_loss: 0.4065 - val_accuracy: 0.8098 Epoch 60/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4016 - accuracy: 0.8051 - val_loss: 0.4178 - val_accuracy: 0.8119 Epoch 61/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3949 - accuracy: 0.8060 - val_loss: 0.4105 - val_accuracy: 0.8098 Epoch 62/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3983 - accuracy: 0.8065 - val_loss: 0.4164 - val_accuracy: 0.8141 Epoch 63/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3970 - accuracy: 0.8074 - val_loss: 0.4245 - val_accuracy: 0.8148 Epoch 64/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3977 - accuracy: 0.8064 - val_loss: 0.4140 - val_accuracy: 0.8062 Epoch 65/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3936 - accuracy: 0.8103 - val_loss: 0.4137 - val_accuracy: 0.8055 Epoch 66/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3942 - accuracy: 0.8074 - val_loss: 0.4251 - val_accuracy: 0.8119 Epoch 67/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3985 - accuracy: 0.8074 - val_loss: 0.4232 - val_accuracy: 0.8077 Epoch 68/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3867 - accuracy: 0.8117 - 
val_loss: 0.4289 - val_accuracy: 0.8070 Epoch 69/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3969 - accuracy: 0.8074 - val_loss: 0.4240 - val_accuracy: 0.8062 Epoch 70/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3921 - accuracy: 0.8094 - val_loss: 0.4273 - val_accuracy: 0.7999 Epoch 71/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3964 - accuracy: 0.8044 - val_loss: 0.4254 - val_accuracy: 0.8091 Epoch 72/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3910 - accuracy: 0.8064 - val_loss: 0.4204 - val_accuracy: 0.8013 Epoch 73/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3948 - accuracy: 0.8069 - val_loss: 0.4322 - val_accuracy: 0.8126 Epoch 74/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3934 - accuracy: 0.8129 - val_loss: 0.4299 - val_accuracy: 0.8126 Epoch 75/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3910 - accuracy: 0.8129 - val_loss: 0.4237 - val_accuracy: 0.8112 Epoch 76/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3934 - accuracy: 0.8106 - val_loss: 0.4285 - val_accuracy: 0.8105 Epoch 77/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3911 - accuracy: 0.8147 - val_loss: 0.4332 - val_accuracy: 0.8070 Epoch 78/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3852 - accuracy: 0.8159 - val_loss: 0.4346 - val_accuracy: 0.8055 Epoch 79/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3920 - accuracy: 0.8053 - val_loss: 0.4263 - val_accuracy: 0.7991 Epoch 80/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3880 - accuracy: 0.8119 - val_loss: 0.4288 - val_accuracy: 0.8105 Epoch 81/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3869 - accuracy: 0.8136 - val_loss: 0.4255 - val_accuracy: 0.8126 Epoch 82/150 2817/2817 
[==============================] - 11s 4ms/step - loss: 0.3885 - accuracy: 0.8113 - val_loss: 0.4404 - val_accuracy: 0.8048 Epoch 83/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3904 - accuracy: 0.8064 - val_loss: 0.4423 - val_accuracy: 0.8091 Epoch 84/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3873 - accuracy: 0.8124 - val_loss: 0.4267 - val_accuracy: 0.8027 Epoch 85/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3895 - accuracy: 0.8135 - val_loss: 0.4224 - val_accuracy: 0.8112 Epoch 86/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3815 - accuracy: 0.8152 - val_loss: 0.4383 - val_accuracy: 0.8034 Epoch 87/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3858 - accuracy: 0.8142 - val_loss: 0.4377 - val_accuracy: 0.8070 Epoch 88/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3856 - accuracy: 0.8151 - val_loss: 0.4360 - val_accuracy: 0.7991 Epoch 89/150 2817/2817 [==============================] - 13s 4ms/step - loss: 0.3886 - accuracy: 0.8142 - val_loss: 0.4440 - val_accuracy: 0.8013 Epoch 90/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3824 - accuracy: 0.8156 - val_loss: 0.4518 - val_accuracy: 0.8041 Epoch 91/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3826 - accuracy: 0.8154 - val_loss: 0.4380 - val_accuracy: 0.8055 Epoch 92/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3834 - accuracy: 0.8131 - val_loss: 0.4475 - val_accuracy: 0.8077 Epoch 93/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3858 - accuracy: 0.8129 - val_loss: 0.4433 - val_accuracy: 0.8041 Epoch 94/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.3822 - accuracy: 0.8101 - val_loss: 0.4455 - val_accuracy: 0.7991 Epoch 95/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3897 - accuracy: 0.8156 - 
val_loss: 0.4531 - val_accuracy: 0.8041 Epoch 96/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3843 - accuracy: 0.8074 - val_loss: 0.4401 - val_accuracy: 0.8013 Epoch 97/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3829 - accuracy: 0.8117 - val_loss: 0.4347 - val_accuracy: 0.8034 Epoch 98/150 2805/2817 [============================>.] - ETA: 0s - loss: 0.3791 - accuracy: 0.8127Restoring model weights from the end of the best epoch: 78. 2817/2817 [==============================] - 11s 4ms/step - loss: 0.3787 - accuracy: 0.8127 - val_loss: 0.4418 - val_accuracy: 0.8077 Epoch 98: early stopping
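The `2817/2817` step count in the log above follows from the split sizes and the batch size. The arithmetic below is a small sketch, assuming the test split is rounded up to 1409 rows (which matches the support reported later) and that `evaluate` and `predict` use the Keras default batch size of 32.

```python
import math

n_rows = 7043
test_size = 0.2

# Test rows are rounded up; the remaining rows go to training
n_test = math.ceil(n_rows * test_size)
n_train = n_rows - n_test

# One step per batch: batch_size = 2 for fit, 32 (assumed default) for evaluate/predict
train_steps = math.ceil(n_train / 2)
eval_steps = math.ceil(n_test / 32)

print(n_train, n_test, train_steps, eval_steps)
```

With a batch size of 2 each epoch takes 2817 weight updates, which is why an epoch takes about 11 seconds at roughly 4 ms per step; a larger batch size would cut the number of steps per epoch proportionally.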
model_history.history['val_accuracy']
[0.8069552779197693, 0.8147622346878052, 0.8204400539398193, 0.8097941875457764, 0.8211497664451599, 0.8069552779197693, 0.8112136125564575, 0.8097941875457764, 0.8176011443138123, 0.8126330971717834, 0.8112136125564575, 0.8218594789505005, 0.8232789039611816, 0.8168914318084717, 0.8246983885765076, 0.8076649904251099, 0.8183108568191528, 0.8097941875457764, 0.813342809677124, 0.8204400539398193, 0.8232789039611816, 0.8176011443138123, 0.819730281829834, 0.8105039000511169, 0.8041163682937622, 0.8083747625350952, 0.8083747625350952, 0.813342809677124, 0.8154719471931458, 0.813342809677124, 0.813342809677124, 0.8090844750404358, 0.8126330971717834, 0.8105039000511169, 0.8069552779197693, 0.8076649904251099, 0.8083747625350952, 0.8083747625350952, 0.8090844750404358, 0.8069552779197693, 0.8126330971717834, 0.8105039000511169, 0.8048261404037476, 0.8062455654144287, 0.8083747625350952, 0.8062455654144287, 0.8140525221824646, 0.8105039000511169, 0.8097941875457764, 0.8119233250617981, 0.8076649904251099, 0.8069552779197693, 0.8076649904251099, 0.8097941875457764, 0.8062455654144287, 0.8119233250617981, 0.8105039000511169, 0.8126330971717834, 0.8097941875457764, 0.8119233250617981, 0.8097941875457764, 0.8140525221824646, 0.8147622346878052, 0.8062455654144287, 0.8055358529090881, 0.8119233250617981, 0.8076649904251099, 0.8069552779197693, 0.8062455654144287, 0.799858033657074, 0.8090844750404358, 0.8012775182723999, 0.8126330971717834, 0.8126330971717834, 0.8112136125564575, 0.8105039000511169, 0.8069552779197693, 0.8055358529090881, 0.7991483211517334, 0.8105039000511169, 0.8126330971717834, 0.8048261404037476, 0.8090844750404358, 0.802696943283081, 0.8112136125564575, 0.8034066557884216, 0.8069552779197693, 0.7991483211517334, 0.8012775182723999, 0.8041163682937622, 0.8055358529090881, 0.8076649904251099, 0.8041163682937622, 0.7991483211517334, 0.8041163682937622, 0.8012775182723999, 0.8034066557884216, 0.8076649904251099]
model_history.history['accuracy']
[0.7468938827514648, 0.776002824306488, 0.7811501622200012, 0.7848775386810303, 0.7862975001335144, 0.7973020672798157, 0.786829948425293, 0.791799783706665, 0.7875399589538574, 0.7873624563217163, 0.7948172092437744, 0.7896698713302612, 0.7926872372627258, 0.7923322916030884, 0.7903798222541809, 0.7980120778083801, 0.7951721549034119, 0.795349657535553, 0.7948172092437744, 0.7967696189880371, 0.79410719871521, 0.7983670830726624, 0.7985445261001587, 0.7985445261001587, 0.8012069463729858, 0.7983670830726624, 0.799077033996582, 0.800319492816925, 0.8012069463729858, 0.8028044104576111, 0.79410719871521, 0.8019169569015503, 0.8042243719100952, 0.8035143613815308, 0.8035143613815308, 0.8028044104576111, 0.8028044104576111, 0.7992545366287231, 0.803869366645813, 0.8049343228340149, 0.805111825466156, 0.7996095418930054, 0.8054668307304382, 0.808661699295044, 0.8090167045593262, 0.8068867325782776, 0.8044018745422363, 0.8040468692779541, 0.807596743106842, 0.8074192404747009, 0.8072417378425598, 0.8065317869186401, 0.8047568202018738, 0.8042243719100952, 0.8045793175697327, 0.8081291913986206, 0.8081291913986206, 0.8049343228340149, 0.8012069463729858, 0.805111825466156, 0.8059992790222168, 0.8065317869186401, 0.8074192404747009, 0.806354284286499, 0.8102591633796692, 0.8074192404747009, 0.8074192404747009, 0.8116790652275085, 0.8074192404747009, 0.8093716502189636, 0.8044018745422363, 0.806354284286499, 0.8068867325782776, 0.8129215240478516, 0.8129215240478516, 0.8106141090393066, 0.8146964907646179, 0.8159389495849609, 0.8052893280982971, 0.8118565678596497, 0.813631534576416, 0.8113241195678711, 0.806354284286499, 0.812389075756073, 0.8134540319442749, 0.8152289390563965, 0.8141639828681946, 0.8150514960289001, 0.8141639828681946, 0.8155839443206787, 0.8154064416885376, 0.8130990266799927, 0.8129215240478516, 0.8100816607475281, 0.8155839443206787, 0.8074192404747009, 0.8116790652275085, 0.8127440810203552]
model_history.history.keys()
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
import plotly.graph_objects as go
# Plot training and validation accuracy per epoch
fig_1 = go.Figure()
fig_1.add_trace(go.Scatter(x = np.arange(0, len(model_history.history['accuracy'])),
                           y = model_history.history['val_accuracy'],
                           mode = 'lines+markers',
                           name = 'val_accuracy'))
fig_1.add_trace(go.Scatter(x = np.arange(0, len(model_history.history['accuracy'])),
                           y = model_history.history['accuracy'],
                           mode = 'lines+markers',
                           name = 'Accuracy'))
fig_1.update_layout(title = 'ACCURACY vs VALIDATION_ACCURACY')
fig_1.update_xaxes(title_text = "Epochs")
fig_1.update_yaxes(title_text = "Accuracy")
fig_1.show()
model_history.history['loss']
[0.5081272125244141, 0.4685124456882477, 0.4550357758998871, 0.455153226852417, 0.45062726736068726, 0.4409547448158264, 0.44353967905044556, 0.4450571835041046, 0.44258907437324524, 0.4388164281845093, 0.43916210532188416, 0.4344507157802582, 0.43371817469596863, 0.4369526207447052, 0.4357856810092926, 0.4303082823753357, 0.4274821877479553, 0.4328174889087677, 0.4274732768535614, 0.4226454794406891, 0.4251917004585266, 0.4278009831905365, 0.4219277799129486, 0.417755126953125, 0.4175887107849121, 0.41787275671958923, 0.41787877678871155, 0.4177219271659851, 0.41619035601615906, 0.41140833497047424, 0.41274720430374146, 0.4136614203453064, 0.41135138273239136, 0.4119621217250824, 0.41142210364341736, 0.40689873695373535, 0.4100908637046814, 0.40556854009628296, 0.40873223543167114, 0.4051817059516907, 0.40621814131736755, 0.41144031286239624, 0.4052787721157074, 0.40283939242362976, 0.3967612385749817, 0.4072614312171936, 0.40564730763435364, 0.4072452485561371, 0.40283632278442383, 0.4026649296283722, 0.4035778045654297, 0.4005022644996643, 0.4024786949157715, 0.4023760259151459, 0.40020716190338135, 0.4008631408214569, 0.4012056887149811, 0.3977046012878418, 0.39786916971206665, 0.40157219767570496, 0.3948601186275482, 0.39832812547683716, 0.3970475494861603, 0.3976713716983795, 0.3936421275138855, 0.3942430019378662, 0.39846935868263245, 0.3866622745990753, 0.39691194891929626, 0.3920910060405731, 0.3963838815689087, 0.39102184772491455, 0.39482757449150085, 0.39341360330581665, 0.3909689486026764, 0.39342477917671204, 0.39107751846313477, 0.38519835472106934, 0.39197638630867004, 0.3880450427532196, 0.38694512844085693, 0.38853737711906433, 0.3903755843639374, 0.3873341381549835, 0.3895360231399536, 0.3814878463745117, 0.3858412504196167, 0.3856278657913208, 0.3886488080024719, 0.3823695480823517, 0.3826170563697815, 0.3833910822868347, 0.38578200340270996, 0.38219648599624634, 0.3896782100200653, 0.3843458890914917, 0.3829021453857422, 0.37872254848480225]
model_history.history['val_loss']
[0.40535616874694824, 0.3948274254798889, 0.4082051217556, 0.39829161763191223, 0.39419999718666077, 0.40953198075294495, 0.39598217606544495, 0.3951069116592407, 0.39632049202919006, 0.39559921622276306, 0.395935982465744, 0.38972020149230957, 0.39144089818000793, 0.3916804790496826, 0.3903910517692566, 0.4021134674549103, 0.3934386074542999, 0.3974713683128357, 0.39747804403305054, 0.39503487944602966, 0.39070501923561096, 0.3929044008255005, 0.397475928068161, 0.39935338497161865, 0.4008231461048126, 0.3999277353286743, 0.40143662691116333, 0.4033142924308777, 0.3942776620388031, 0.4029252231121063, 0.4018080234527588, 0.4062395691871643, 0.40027502179145813, 0.4083457589149475, 0.40662145614624023, 0.41092896461486816, 0.4145979583263397, 0.40669071674346924, 0.40952977538108826, 0.40616244077682495, 0.4131413698196411, 0.4092729389667511, 0.4034976065158844, 0.41512757539749146, 0.4119105339050293, 0.40655964612960815, 0.408929705619812, 0.41145050525665283, 0.41087982058525085, 0.41411158442497253, 0.4156970977783203, 0.407317578792572, 0.4160541296005249, 0.40773844718933105, 0.4103267192840576, 0.4156971871852875, 0.4056411683559418, 0.41510361433029175, 0.40651416778564453, 0.41778460144996643, 0.4104567766189575, 0.41644802689552307, 0.4245200455188751, 0.41403549909591675, 0.41373562812805176, 0.4250999987125397, 0.42323631048202515, 0.42887187004089355, 0.4240097105503082, 0.42733845114707947, 0.4254225194454193, 0.4203762710094452, 0.4321698844432831, 0.4299393892288208, 0.42370253801345825, 0.4285167455673218, 0.4332222044467926, 0.43463361263275146, 0.4263263940811157, 0.4288237690925598, 0.42554783821105957, 0.44043630361557007, 0.44226983189582825, 0.42672276496887207, 0.4223831295967102, 0.43828094005584717, 0.4376813769340515, 0.43595030903816223, 0.4439915716648102, 0.4518167972564697, 0.4379619061946869, 0.44754624366760254, 0.4432525038719177, 0.4454522132873535, 0.45309847593307495, 0.44010117650032043, 0.43467891216278076, 
0.44175300002098083]
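The validation loss printed above is lowest at epoch 12 (about 0.3897) and drifts upward afterwards, while the callback, which monitors training accuracy, kept training until epoch 98. The sketch below is a simplified plain-Python model of patience-based early stopping, not the Keras implementation; the function name `early_stopping_epoch` and the toy value lists are hypothetical.

```python
def early_stopping_epoch(values, patience, min_delta=0.0, mode="max"):
    """Return (stop_epoch, best_epoch), 1-based; stop_epoch is None if never triggered."""
    best = None
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if mode == "max":
            improved = best is None or value - min_delta > best
        else:
            improved = best is None or value + min_delta < best
        if improved:
            # New best value: remember it and reset the patience counter
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Toy accuracy curve (higher is better) and toy loss curve (lower is better)
acc_stop = early_stopping_epoch([0.70, 0.75, 0.74, 0.745, 0.73], patience=3, mode="max")
loss_stop = early_stopping_epoch([0.40, 0.39, 0.41, 0.42], patience=2, mode="min")
print(acc_stop, loss_stop)
```

Applied to the `val_loss` list above with `patience=20`, `min_delta=0.0001` and `mode="min"`, this simplified logic would pick epoch 12 as the best epoch and stop at epoch 32, rather than training on to epoch 98.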
# Plot training and validation loss per epoch
fig_2 = go.Figure()
fig_2.add_trace(go.Scatter(x = np.arange(0, len(model_history.history['loss'])),
                           y = model_history.history['loss'],
                           mode = 'lines+markers',
                           name = 'loss'))
fig_2.add_trace(go.Scatter(x = np.arange(0, len(model_history.history['loss'])),
                           y = model_history.history['val_loss'],
                           mode = 'lines+markers',
                           name = 'val_loss'))
fig_2.update_layout(title = 'LOSS vs VALIDATION_LOSS')
fig_2.update_xaxes(title_text = "Epochs")
fig_2.update_yaxes(title_text = "Loss")
fig_2.show()
ANN_model.evaluate(x_test, y_test)
45/45 [==============================] - 0s 3ms/step - loss: 0.4346 - accuracy: 0.8055
[0.4346335530281067, 0.8055358529090881]
from sklearn.metrics import confusion_matrix, classification_report
predict = ANN_model.predict(x_test)
45/45 [==============================] - 0s 1ms/step
predict
array([[9.7893542e-05],
[7.0410240e-01],
[6.3315587e-04],
...,
[8.2363045e-01],
[5.0050294e-01],
[5.0984734e-01]], dtype=float32)
# Convert predicted probabilities to class labels using a 0.5 threshold
predict_new = []
for x in predict:
    if x >= 0.5:
        predict_new.append(1)
    else:
        predict_new.append(0)
predict_new[-10 : ]
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
plx.imshow(confusion_matrix( y_test, predict_new), text_auto = True)
print(classification_report(y_test, predict_new))
              precision    recall  f1-score   support

         0.0       0.85      0.90      0.87      1062
         1.0       0.63      0.52      0.57       347

    accuracy                           0.81      1409
   macro avg       0.74      0.71      0.72      1409
weighted avg       0.80      0.81      0.80      1409
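The figures in this report can be derived from the four cells of the confusion matrix. The sketch below uses hypothetical counts, chosen only because they are consistent with the rounded report and the 1409 test rows; the actual confusion matrix values are not shown in the text, and other counts would round to the same figures.

```python
# Hypothetical confusion-matrix counts for the churn class (label 1);
# chosen to be consistent with the rounded report, not taken from the notebook
tp, fn = 180, 167   # churners predicted as churn / missed (347 actual churners)
tn, fp = 955, 107   # non-churners predicted correctly / flagged wrongly (1062 actual)

precision = tp / (tp + fp)            # of predicted churners, how many really churned
recall = tp / (tp + fn)               # of actual churners, how many were caught
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fn + tn + fp)

# Accuracy of always predicting "no churn", for comparison
baseline = (tn + fp) / (tp + fn + tn + fp)

print(round(precision, 2), round(recall, 2), round(f1, 2))
print(round(accuracy, 4), round(baseline, 4))
```

Because only 347 of the 1409 test customers churned, always predicting "no churn" would already score about 75% accuracy, so the recall of 0.52 on the churn class says more about the model than the 81% overall accuracy.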
# Compute the correlation matrix once and reuse it for the heatmap and the table
corr = df.corr()
plx.imshow(corr, height = 1700, width = 1700, text_auto = True)
corr
| | SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SeniorCitizen | 1.000000 | 0.016479 | -0.211185 | 0.016567 | 0.008576 | 0.142948 | -0.038653 | 0.066572 | 0.059428 | -0.060625 | 0.105378 | 0.120176 | 0.156530 | 0.220173 | 0.102994 | 0.150889 | -0.001874 | -0.108322 | 0.255338 | -0.182742 | 0.138360 | -0.046262 | -0.117000 | -0.016159 | -0.024135 | 0.171718 | -0.153477 |
| Partner | 0.016479 | 1.000000 | 0.452676 | 0.379697 | 0.017706 | 0.142057 | 0.143106 | 0.141498 | 0.153786 | 0.119999 | 0.124666 | 0.117412 | -0.014877 | 0.096848 | 0.317540 | -0.150448 | -0.001808 | -0.000851 | 0.000304 | 0.000615 | -0.280865 | 0.082783 | 0.248091 | 0.110706 | 0.082029 | -0.083852 | -0.095125 |
| Dependents | -0.211185 | 0.452676 | 1.000000 | 0.159712 | -0.001762 | -0.024526 | 0.080972 | 0.023671 | 0.013963 | 0.063268 | -0.016558 | -0.039741 | -0.111377 | -0.113890 | 0.062136 | -0.164221 | 0.010517 | 0.052010 | -0.165818 | 0.139812 | -0.231720 | 0.068368 | 0.204613 | 0.052021 | 0.060267 | -0.150642 | 0.059071 |
| tenure | 0.016567 | 0.379697 | 0.159712 | 1.000000 | 0.008448 | 0.331941 | 0.327203 | 0.360277 | 0.360653 | 0.324221 | 0.279756 | 0.286111 | 0.006152 | 0.247900 | 0.826160 | -0.352229 | 0.005106 | 0.013274 | 0.019720 | -0.039062 | -0.645561 | 0.202570 | 0.558533 | 0.243510 | 0.233006 | -0.208363 | -0.233852 |
| PhoneService | 0.008576 | 0.017706 | -0.001762 | 0.008448 | 1.000000 | 0.279690 | -0.092893 | -0.052312 | -0.071227 | -0.096340 | -0.022574 | -0.032959 | 0.016505 | 0.247398 | 0.113207 | 0.011942 | -0.006488 | -0.452425 | 0.289999 | 0.172209 | -0.000742 | -0.002791 | 0.003519 | 0.007556 | -0.007721 | 0.003062 | -0.003319 |
| MultipleLines | 0.142948 | 0.142057 | -0.024526 | 0.331941 | 0.279690 | 1.000000 | 0.098108 | 0.202237 | 0.201137 | 0.100571 | 0.257152 | 0.258751 | 0.163530 | 0.490434 | 0.468516 | 0.040102 | -0.008414 | -0.199920 | 0.366083 | -0.210564 | -0.088203 | -0.003794 | 0.106253 | 0.075527 | 0.060048 | 0.083618 | -0.227206 |
| OnlineSecurity | -0.038653 | 0.143106 | 0.080972 | 0.327203 | -0.092893 | 0.098108 | 1.000000 | 0.283832 | 0.275438 | 0.354931 | 0.176207 | 0.187398 | -0.003636 | 0.296594 | 0.411672 | -0.171226 | -0.017021 | 0.321269 | -0.030696 | -0.333403 | -0.246679 | 0.100162 | 0.191773 | 0.095158 | 0.115721 | -0.112338 | -0.080798 |
| OnlineBackup | 0.066572 | 0.141498 | 0.023671 | 0.360277 | -0.052312 | 0.202237 | 0.283832 | 1.000000 | 0.303546 | 0.294233 | 0.282106 | 0.274501 | 0.126735 | 0.441780 | 0.509246 | -0.082255 | -0.013773 | 0.157884 | 0.165651 | -0.381593 | -0.164172 | 0.083722 | 0.111400 | 0.087004 | 0.090785 | -0.000408 | -0.174164 |
| DeviceProtection | 0.059428 | 0.153786 | 0.013963 | 0.360653 | -0.071227 | 0.201137 | 0.275438 | 0.303546 | 1.000000 | 0.333313 | 0.390874 | 0.402111 | 0.103797 | 0.482692 | 0.522003 | -0.066160 | -0.002105 | 0.146291 | 0.176049 | -0.380754 | -0.225662 | 0.102495 | 0.165096 | 0.083115 | 0.111554 | -0.003351 | -0.187373 |
| TechSupport | -0.060625 | 0.119999 | 0.063268 | 0.324221 | -0.096340 | 0.100571 | 0.354931 | 0.294233 | 0.333313 | 1.000000 | 0.278070 | 0.279358 | 0.037880 | 0.338304 | 0.431904 | -0.164674 | -0.009212 | 0.313118 | -0.020492 | -0.336298 | -0.285241 | 0.095775 | 0.240824 | 0.101252 | 0.117272 | -0.114839 | -0.085509 |
| StreamingTV | 0.105378 | 0.124666 | -0.016558 | 0.279756 | -0.022574 | 0.257152 | 0.176207 | 0.282106 | 0.390874 | 0.278070 | 1.000000 | 0.533094 | 0.223841 | 0.629603 | 0.514990 | 0.063228 | -0.008393 | 0.016274 | 0.329349 | -0.415552 | -0.112282 | 0.061612 | 0.072049 | 0.046252 | 0.040433 | 0.144626 | -0.247742 |
| StreamingMovies | 0.120176 | 0.117412 | -0.039741 | 0.286111 | -0.032959 | 0.258751 | 0.187398 | 0.274501 | 0.402111 | 0.279358 | 0.533094 | 1.000000 | 0.211716 | 0.627429 | 0.520118 | 0.061382 | -0.010487 | 0.025698 | 0.322923 | -0.418675 | -0.116633 | 0.064926 | 0.073960 | 0.048652 | 0.048575 | 0.137966 | -0.250595 |
| PaperlessBilling | 0.156530 | -0.014877 | -0.111377 | 0.006152 | 0.016505 | 0.163530 | -0.003636 | 0.126735 | 0.103797 | 0.037880 | 0.223841 | 0.211716 | 1.000000 | 0.352150 | 0.158557 | 0.191825 | -0.011754 | -0.063121 | 0.326853 | -0.321013 | 0.169096 | -0.051391 | -0.147889 | -0.016332 | -0.013589 | 0.208865 | -0.205398 |
| MonthlyCharges | 0.220173 | 0.096848 | -0.113890 | 0.247900 | 0.247398 | 0.490434 | 0.296594 | 0.441780 | 0.482692 | 0.338304 | 0.629603 | 0.627429 | 0.352150 | 1.000000 | 0.651169 | 0.193356 | -0.014569 | -0.160189 | 0.787066 | -0.763557 | 0.060165 | 0.004904 | -0.074681 | 0.042812 | 0.030550 | 0.271625 | -0.377437 |
| TotalCharges | 0.102994 | 0.317540 | 0.062136 | 0.826160 | 0.113207 | 0.468516 | 0.411672 | 0.509246 | 0.522003 | 0.431904 | 0.514990 | 0.520118 | 0.158557 | 0.651169 | 1.000000 | -0.198353 | -0.000077 | -0.052462 | 0.361636 | -0.375207 | -0.444311 | 0.170810 | 0.354550 | 0.185990 | 0.182910 | -0.059274 | -0.295726 |
| Churn | 0.150889 | -0.150448 | -0.164221 | -0.352229 | 0.011942 | 0.040102 | -0.171226 | -0.082255 | -0.066160 | -0.164674 | 0.063228 | 0.061382 | 0.191825 | 0.193356 | -0.198353 | 1.000000 | -0.008612 | -0.124214 | 0.308020 | -0.227890 | 0.405103 | -0.177820 | -0.302253 | -0.117937 | -0.134302 | 0.301919 | -0.091683 |
| gender_Male | -0.001874 | -0.001808 | 0.010517 | 0.005106 | -0.006488 | -0.008414 | -0.017021 | -0.013773 | -0.002105 | -0.009212 | -0.008393 | -0.010487 | -0.011754 | -0.014569 | -0.000077 | -0.008612 | 1.000000 | 0.006568 | -0.011286 | 0.006026 | -0.003386 | 0.008026 | -0.003695 | -0.016024 | 0.001215 | 0.000752 | 0.013744 |
| InternetService_DSL | -0.108322 | -0.000851 | 0.052010 | 0.013274 | -0.452425 | -0.199920 | 0.321269 | 0.157884 | 0.146291 | 0.313118 | 0.016274 | 0.025698 | -0.063121 | -0.160189 | -0.052462 | -0.124214 | 0.006568 | 1.000000 | -0.640987 | -0.380635 | -0.065509 | 0.046795 | 0.031714 | 0.025476 | 0.051438 | -0.104418 | 0.041899 |
| InternetService_Fiber optic | 0.255338 | 0.000304 | -0.165818 | 0.019720 | 0.289999 | 0.366083 | -0.030696 | 0.165651 | 0.176049 | -0.020492 | 0.329349 | 0.322923 | 0.326853 | 0.787066 | 0.361636 | 0.308020 | -0.011286 | -0.640987 | 1.000000 | -0.465793 | 0.244164 | -0.076324 | -0.211526 | -0.022624 | -0.050077 | 0.336410 | -0.306834 |
| InternetService_No | -0.182742 | 0.000615 | 0.139812 | -0.039062 | 0.172209 | -0.210564 | -0.333403 | -0.381593 | -0.380754 | -0.336298 | -0.415552 | -0.418675 | -0.321013 | -0.763557 | -0.375207 | -0.227890 | 0.006026 | -0.380635 | -0.465793 | 1.000000 | -0.218639 | 0.038004 | 0.218278 | -0.002113 | 0.001030 | -0.284917 | 0.321361 |
| Contract_Month-to-month | 0.138360 | -0.280865 | -0.231720 | -0.645561 | -0.000742 | -0.088203 | -0.246679 | -0.164172 | -0.225662 | -0.285241 | -0.112282 | -0.116633 | 0.169096 | 0.060165 | -0.444311 | 0.405103 | -0.003386 | -0.065509 | 0.244164 | -0.218639 | 1.000000 | -0.568744 | -0.622633 | -0.179707 | -0.204145 | 0.331661 | 0.004138 |
| Contract_One year | -0.046262 | 0.082783 | 0.068368 | 0.202570 | -0.002791 | -0.003794 | 0.100162 | 0.083722 | 0.102495 | 0.095775 | 0.061612 | 0.064926 | -0.051391 | 0.004904 | 0.170810 | -0.177820 | 0.008026 | 0.046795 | -0.076324 | 0.038004 | -0.568744 | 1.000000 | -0.289510 | 0.057451 | 0.067589 | -0.109130 | -0.000116 |
| Contract_Two year | -0.117000 | 0.248091 | 0.204613 | 0.558533 | 0.003519 | 0.106253 | 0.191773 | 0.111400 | 0.165096 | 0.240824 | 0.072049 | 0.073960 | -0.147889 | -0.074681 | 0.354550 | -0.302253 | -0.003695 | 0.031714 | -0.211526 | 0.218278 | -0.622633 | -0.289510 | 1.000000 | 0.154471 | 0.173265 | -0.282138 | -0.004705 |
| PaymentMethod_Bank transfer (automatic) | -0.016159 | 0.110706 | 0.052021 | 0.243510 | 0.007556 | 0.075527 | 0.095158 | 0.087004 | 0.083115 | 0.101252 | 0.046252 | 0.048652 | -0.016332 | 0.042812 | 0.185990 | -0.117937 | -0.016024 | 0.025476 | -0.022624 | -0.002113 | -0.179707 | 0.057451 | 0.154471 | 1.000000 | -0.278215 | -0.376762 | -0.288685 |
| PaymentMethod_Credit card (automatic) | -0.024135 | 0.082029 | 0.060267 | 0.233006 | -0.007721 | 0.060048 | 0.115721 | 0.090785 | 0.111554 | 0.117272 | 0.040433 | 0.048575 | -0.013589 | 0.030550 | 0.182910 | -0.134302 | 0.001215 | 0.051438 | -0.050077 | 0.001030 | -0.204145 | 0.067589 | 0.173265 | -0.278215 | 1.000000 | -0.373322 | -0.286049 |
| PaymentMethod_Electronic check | 0.171718 | -0.083852 | -0.150642 | -0.208363 | 0.003062 | 0.083618 | -0.112338 | -0.000408 | -0.003351 | -0.114839 | 0.144626 | 0.137966 | 0.208865 | 0.271625 | -0.059274 | 0.301919 | 0.000752 | -0.104418 | 0.336410 | -0.284917 | 0.331661 | -0.109130 | -0.282138 | -0.376762 | -0.373322 | 1.000000 | -0.387372 |
| PaymentMethod_Mailed check | -0.153477 | -0.095125 | 0.059071 | -0.233852 | -0.003319 | -0.227206 | -0.080798 | -0.174164 | -0.187373 | -0.085509 | -0.247742 | -0.250595 | -0.205398 | -0.377437 | -0.295726 | -0.091683 | 0.013744 | 0.041899 | -0.306834 | 0.321361 | 0.004138 | -0.000116 | -0.004705 | -0.288685 | -0.286049 | -0.387372 | 1.000000 |
# Absolute correlation of every column with the target column 'Churn'
index = corr['Churn'].index
values = corr['Churn'].abs().values
sort_df = pd.DataFrame({'index': list(index), 'values': list(values)})
sort_df
| index | values | |
|---|---|---|
| 0 | SeniorCitizen | 0.150889 |
| 1 | Partner | 0.150448 |
| 2 | Dependents | 0.164221 |
| 3 | tenure | 0.352229 |
| 4 | PhoneService | 0.011942 |
| 5 | MultipleLines | 0.040102 |
| 6 | OnlineSecurity | 0.171226 |
| 7 | OnlineBackup | 0.082255 |
| 8 | DeviceProtection | 0.066160 |
| 9 | TechSupport | 0.164674 |
| 10 | StreamingTV | 0.063228 |
| 11 | StreamingMovies | 0.061382 |
| 12 | PaperlessBilling | 0.191825 |
| 13 | MonthlyCharges | 0.193356 |
| 14 | TotalCharges | 0.198353 |
| 15 | Churn | 1.000000 |
| 16 | gender_Male | 0.008612 |
| 17 | InternetService_DSL | 0.124214 |
| 18 | InternetService_Fiber optic | 0.308020 |
| 19 | InternetService_No | 0.227890 |
| 20 | Contract_Month-to-month | 0.405103 |
| 21 | Contract_One year | 0.177820 |
| 22 | Contract_Two year | 0.302253 |
| 23 | PaymentMethod_Bank transfer (automatic) | 0.117937 |
| 24 | PaymentMethod_Credit card (automatic) | 0.134302 |
| 25 | PaymentMethod_Electronic check | 0.301919 |
| 26 | PaymentMethod_Mailed check | 0.091683 |
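# Sort ascending by absolute correlation, so the weakest features come first
sort_df = sort_df.sort_values(by=['values'])
# Renumber the rows from 0; the previous row positions are kept in the 'level_0' column
sort_df.reset_index(inplace = True)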
sort_df
| level_0 | index | values | |
|---|---|---|---|
| 0 | 16 | gender_Male | 0.008612 |
| 1 | 4 | PhoneService | 0.011942 |
| 2 | 5 | MultipleLines | 0.040102 |
| 3 | 11 | StreamingMovies | 0.061382 |
| 4 | 10 | StreamingTV | 0.063228 |
| 5 | 8 | DeviceProtection | 0.066160 |
| 6 | 7 | OnlineBackup | 0.082255 |
| 7 | 26 | PaymentMethod_Mailed check | 0.091683 |
| 8 | 23 | PaymentMethod_Bank transfer (automatic) | 0.117937 |
| 9 | 17 | InternetService_DSL | 0.124214 |
| 10 | 24 | PaymentMethod_Credit card (automatic) | 0.134302 |
| 11 | 1 | Partner | 0.150448 |
| 12 | 0 | SeniorCitizen | 0.150889 |
| 13 | 2 | Dependents | 0.164221 |
| 14 | 9 | TechSupport | 0.164674 |
| 15 | 6 | OnlineSecurity | 0.171226 |
| 16 | 21 | Contract_One year | 0.177820 |
| 17 | 12 | PaperlessBilling | 0.191825 |
| 18 | 13 | MonthlyCharges | 0.193356 |
| 19 | 14 | TotalCharges | 0.198353 |
| 20 | 19 | InternetService_No | 0.227890 |
| 21 | 25 | PaymentMethod_Electronic check | 0.301919 |
| 22 | 22 | Contract_Two year | 0.302253 |
| 23 | 18 | InternetService_Fiber optic | 0.308020 |
| 24 | 3 | tenure | 0.352229 |
| 25 | 20 | Contract_Month-to-month | 0.405103 |
| 26 | 15 | Churn | 1.000000 |
to_remove_features = list(sort_df['index'][0 : 17])
to_remove_features
['gender_Male', 'PhoneService', 'MultipleLines', 'StreamingMovies', 'StreamingTV', 'DeviceProtection', 'OnlineBackup', 'PaymentMethod_Mailed check', 'PaymentMethod_Bank transfer (automatic)', 'InternetService_DSL', 'PaymentMethod_Credit card (automatic)', 'Partner', 'SeniorCitizen', 'Dependents', 'TechSupport', 'OnlineSecurity', 'Contract_One year']
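These are the 17 features with the lowest absolute correlation with Churn. They are dropped below, which leaves 9 input features plus the target. Correlation only captures linear association, so this is a rough filter rather than a full measure of feature importance.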
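The ranking step above can be sketched in a single pandas chain. This is a minimal illustration on a made-up toy frame, not the notebook's data; the helper name `rank_by_abs_corr` and the columns `strong`, `inverse` and `noise` are assumptions introduced only for this example. Unlike the slice `[0 : 17]` used above, the sketch drops the target column explicitly before ranking.

```python
import pandas as pd

def rank_by_abs_corr(frame, target):
    """Return |corr(feature, target)| for every other column, weakest first."""
    # Correlation of each column with the target, without the target itself
    corr_with_target = frame.corr()[target].drop(labels=[target])
    return corr_with_target.abs().sort_values()

# Hypothetical toy data (not taken from the churn dataset)
toy = pd.DataFrame({
    "Churn":   [0, 0, 1, 1, 0, 1],
    "strong":  [0, 0, 1, 1, 0, 1],   # identical to the target
    "inverse": [5, 6, 1, 2, 5, 1],   # strongly negatively related
    "noise":   [1, 0, 1, 0, 0, 1],   # weakly related
})

ranking = rank_by_abs_corr(toy, "Churn")
weakest = list(ranking.index[:1])   # candidates to drop
print(ranking)
print(weakest)
```

Taking the absolute value first matters: a strongly negative correlation, such as `tenure` in the table above, is as informative as a strongly positive one.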
# Keep a copy of the full feature set; it is reused for the oversampling experiment
df_old = df.copy()
df = df.drop(to_remove_features, axis = 1)
df
| tenure | PaperlessBilling | MonthlyCharges | TotalCharges | Churn | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_Two year | PaymentMethod_Electronic check | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.013889 | 1.0 | 0.115423 | 0.001275 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| 1 | 0.472222 | 0.0 | 0.385075 | 0.215867 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 0.027778 | 1.0 | 0.354229 | 0.010310 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 3 | 0.625000 | 0.0 | 0.239303 | 0.210241 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 0.027778 | 1.0 | 0.521891 | 0.015330 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0.333333 | 1.0 | 0.662189 | 0.227521 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7039 | 1.000000 | 1.0 | 0.845274 | 0.847461 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7040 | 0.152778 | 1.0 | 0.112935 | 0.037809 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 |
| 7041 | 0.055556 | 1.0 | 0.558706 | 0.033210 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 7042 | 0.916667 | 1.0 | 0.869652 | 0.787641 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 |
7043 rows × 10 columns
x_train = x_train.drop(to_remove_features, axis = 1)
x_test = x_test.drop(to_remove_features, axis = 1)
ANN_model_2 = Sequential()
# First hidden layer (9 units); Keras infers the input shape from the data on the first call to fit
ANN_model_2.add(Dense(units = 9, activation = 'relu'))
# Second hidden layer (7 units)
ANN_model_2.add(Dense(units = 7, activation = 'relu'))
#ANN_model_2.add(Dropout(0.3))
# Third hidden layer (3 units)
ANN_model_2.add(Dense(units = 3, activation = 'relu'))
#ANN_model_2.add(Dropout(0.3))
# Output layer: a single sigmoid unit that returns the churn probability
ANN_model_2.add(Dense(units = 1, activation = 'sigmoid'))
# Compiling the ANN model with required parameters
ANN_model_2.compile(optimizer = 'adam',
loss = 'binary_crossentropy',
metrics = ['accuracy'])
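# Early stopping ends training once the monitored metric has not improved by at least
# min_delta for `patience` consecutive epochs, then restores the weights of the best epoch.
# Note: monitor="accuracy" tracks the training accuracy, not val_accuracy or val_loss.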
early_stopping = tf.keras.callbacks.EarlyStopping(
monitor="accuracy",
min_delta=0.0001,
patience=20,
verbose=1,
mode="auto",
baseline=None,
restore_best_weights=True
)
model_history_2 = ANN_model_2.fit(x_train, y_train, batch_size = 2, epochs = 150, validation_data = (x_test, y_test), callbacks = [early_stopping])
Epoch 1/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4722 - accuracy: 0.7575 - val_loss: 0.4139 - val_accuracy: 0.8041
Epoch 2/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4427 - accuracy: 0.7904 - val_loss: 0.4095 - val_accuracy: 0.8062
Epoch 3/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4382 - accuracy: 0.7932 - val_loss: 0.4044 - val_accuracy: 0.8034
Epoch 4/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4357 - accuracy: 0.7946 - val_loss: 0.4049 - val_accuracy: 0.8020
Epoch 5/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4345 - accuracy: 0.7946 - val_loss: 0.4018 - val_accuracy: 0.8112
Epoch 6/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4334 - accuracy: 0.7948 - val_loss: 0.4033 - val_accuracy: 0.8034
Epoch 7/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4320 - accuracy: 0.7936 - val_loss: 0.3998 - val_accuracy: 0.8048
Epoch 8/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4309 - accuracy: 0.7964 - val_loss: 0.3993 - val_accuracy: 0.8077
Epoch 9/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4306 - accuracy: 0.7948 - val_loss: 0.3987 - val_accuracy: 0.8105
Epoch 10/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4308 - accuracy: 0.7934 - val_loss: 0.4002 - val_accuracy: 0.8062
Epoch 11/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4295 - accuracy: 0.7932 - val_loss: 0.4020 - val_accuracy: 0.8013
Epoch 12/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4295 - accuracy: 0.7982 - val_loss: 0.3966 - val_accuracy: 0.8091
Epoch 13/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4289 - accuracy: 0.7955 - val_loss: 0.3965 - val_accuracy: 0.8055
Epoch 14/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4290 - accuracy: 0.7991 - val_loss: 0.3985 - val_accuracy: 0.8048
Epoch 15/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4290 - accuracy: 0.7964 - val_loss: 0.3982 - val_accuracy: 0.8077
Epoch 16/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4284 - accuracy: 0.7922 - val_loss: 0.3966 - val_accuracy: 0.8062
Epoch 17/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4285 - accuracy: 0.7975 - val_loss: 0.3968 - val_accuracy: 0.8070
Epoch 18/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4272 - accuracy: 0.7966 - val_loss: 0.3964 - val_accuracy: 0.8084
Epoch 19/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4274 - accuracy: 0.7959 - val_loss: 0.3965 - val_accuracy: 0.8112
Epoch 20/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4265 - accuracy: 0.7977 - val_loss: 0.3976 - val_accuracy: 0.8062
Epoch 21/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4270 - accuracy: 0.7943 - val_loss: 0.3940 - val_accuracy: 0.8098
Epoch 22/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4279 - accuracy: 0.7977 - val_loss: 0.3947 - val_accuracy: 0.8055
Epoch 23/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4269 - accuracy: 0.7943 - val_loss: 0.3952 - val_accuracy: 0.8133
Epoch 24/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4262 - accuracy: 0.7964 - val_loss: 0.3928 - val_accuracy: 0.8091
Epoch 25/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4258 - accuracy: 0.7980 - val_loss: 0.3928 - val_accuracy: 0.8148
Epoch 26/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4257 - accuracy: 0.7994 - val_loss: 0.4012 - val_accuracy: 0.8048
Epoch 27/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4254 - accuracy: 0.7975 - val_loss: 0.3969 - val_accuracy: 0.8105
Epoch 28/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4245 - accuracy: 0.7975 - val_loss: 0.3953 - val_accuracy: 0.8062
Epoch 29/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4260 - accuracy: 0.7985 - val_loss: 0.3965 - val_accuracy: 0.8098
Epoch 30/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4248 - accuracy: 0.7977 - val_loss: 0.3962 - val_accuracy: 0.8112
Epoch 31/150 2817/2817 [==============================] - 10s 4ms/step - loss: 0.4256 - accuracy: 0.7957 - val_loss: 0.3965 - val_accuracy: 0.8112
Epoch 32/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4252 - accuracy: 0.7969 - val_loss: 0.3926 - val_accuracy: 0.8148
Epoch 33/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4254 - accuracy: 0.7964 - val_loss: 0.3959 - val_accuracy: 0.8148
Epoch 34/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4238 - accuracy: 0.7985 - val_loss: 0.3933 - val_accuracy: 0.8133
Epoch 35/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4248 - accuracy: 0.7964 - val_loss: 0.3929 - val_accuracy: 0.8183
Epoch 36/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4246 - accuracy: 0.7962 - val_loss: 0.3974 - val_accuracy: 0.8148
Epoch 37/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4235 - accuracy: 0.7966 - val_loss: 0.3948 - val_accuracy: 0.8105
Epoch 38/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4250 - accuracy: 0.7987 - val_loss: 0.3984 - val_accuracy: 0.8105
Epoch 39/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4244 - accuracy: 0.7978 - val_loss: 0.3959 - val_accuracy: 0.8190
Epoch 40/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4236 - accuracy: 0.7964 - val_loss: 0.3929 - val_accuracy: 0.8141
Epoch 41/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4242 - accuracy: 0.7969 - val_loss: 0.3964 - val_accuracy: 0.8105
Epoch 42/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4237 - accuracy: 0.8016 - val_loss: 0.3938 - val_accuracy: 0.8098
Epoch 43/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4239 - accuracy: 0.7994 - val_loss: 0.3959 - val_accuracy: 0.8077
Epoch 44/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4234 - accuracy: 0.7991 - val_loss: 0.3958 - val_accuracy: 0.8126
Epoch 45/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4237 - accuracy: 0.7987 - val_loss: 0.3953 - val_accuracy: 0.8141
Epoch 46/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4230 - accuracy: 0.7998 - val_loss: 0.3951 - val_accuracy: 0.8126
Epoch 47/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4230 - accuracy: 0.7977 - val_loss: 0.3942 - val_accuracy: 0.8098
Epoch 48/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4227 - accuracy: 0.7996 - val_loss: 0.3940 - val_accuracy: 0.8098
Epoch 49/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4227 - accuracy: 0.8001 - val_loss: 0.3942 - val_accuracy: 0.8112
Epoch 50/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4226 - accuracy: 0.7978 - val_loss: 0.3931 - val_accuracy: 0.8098
Epoch 51/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4238 - accuracy: 0.7989 - val_loss: 0.3928 - val_accuracy: 0.8119
Epoch 52/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4225 - accuracy: 0.7978 - val_loss: 0.3981 - val_accuracy: 0.8098
Epoch 53/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4238 - accuracy: 0.7994 - val_loss: 0.3957 - val_accuracy: 0.8141
Epoch 54/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4228 - accuracy: 0.7993 - val_loss: 0.3958 - val_accuracy: 0.8119
Epoch 55/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4216 - accuracy: 0.7968 - val_loss: 0.3948 - val_accuracy: 0.8126
Epoch 56/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4224 - accuracy: 0.8016 - val_loss: 0.3956 - val_accuracy: 0.8077
Epoch 57/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4227 - accuracy: 0.8017 - val_loss: 0.4001 - val_accuracy: 0.8084
Epoch 58/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4222 - accuracy: 0.7977 - val_loss: 0.3957 - val_accuracy: 0.8133
Epoch 59/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4224 - accuracy: 0.8005 - val_loss: 0.3968 - val_accuracy: 0.8070
Epoch 60/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4213 - accuracy: 0.7966 - val_loss: 0.3966 - val_accuracy: 0.8105
Epoch 61/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4216 - accuracy: 0.7991 - val_loss: 0.3990 - val_accuracy: 0.8034
Epoch 62/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4226 - accuracy: 0.8019 - val_loss: 0.3947 - val_accuracy: 0.8119
Epoch 63/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4225 - accuracy: 0.7980 - val_loss: 0.3970 - val_accuracy: 0.8105
Epoch 64/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4228 - accuracy: 0.7996 - val_loss: 0.3923 - val_accuracy: 0.8084
Epoch 65/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4222 - accuracy: 0.7991 - val_loss: 0.3966 - val_accuracy: 0.8112
Epoch 66/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4216 - accuracy: 0.8000 - val_loss: 0.3965 - val_accuracy: 0.8112
Epoch 67/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4218 - accuracy: 0.7978 - val_loss: 0.3963 - val_accuracy: 0.8133
Epoch 68/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4218 - accuracy: 0.8007 - val_loss: 0.3956 - val_accuracy: 0.8112
Epoch 69/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4214 - accuracy: 0.7980 - val_loss: 0.3948 - val_accuracy: 0.8133
Epoch 70/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4224 - accuracy: 0.8005 - val_loss: 0.3942 - val_accuracy: 0.8119
Epoch 71/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4219 - accuracy: 0.8026 - val_loss: 0.3967 - val_accuracy: 0.8077
Epoch 72/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4218 - accuracy: 0.7994 - val_loss: 0.3945 - val_accuracy: 0.8133
Epoch 73/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4212 - accuracy: 0.8007 - val_loss: 0.4045 - val_accuracy: 0.8048
Epoch 74/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4217 - accuracy: 0.8009 - val_loss: 0.3944 - val_accuracy: 0.8119
Epoch 75/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4219 - accuracy: 0.7984 - val_loss: 0.3930 - val_accuracy: 0.8119
Epoch 76/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4220 - accuracy: 0.8007 - val_loss: 0.3934 - val_accuracy: 0.8084
Epoch 77/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4210 - accuracy: 0.7980 - val_loss: 0.3951 - val_accuracy: 0.8126
Epoch 78/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4219 - accuracy: 0.7991 - val_loss: 0.3932 - val_accuracy: 0.8077
Epoch 79/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4220 - accuracy: 0.7989 - val_loss: 0.3956 - val_accuracy: 0.8133
Epoch 80/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4216 - accuracy: 0.8009 - val_loss: 0.3968 - val_accuracy: 0.8098
Epoch 81/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4215 - accuracy: 0.7993 - val_loss: 0.3945 - val_accuracy: 0.8148
Epoch 82/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4209 - accuracy: 0.8021 - val_loss: 0.3986 - val_accuracy: 0.8091
Epoch 83/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4217 - accuracy: 0.7980 - val_loss: 0.3937 - val_accuracy: 0.8141
Epoch 84/150 2817/2817 [==============================] - 12s 4ms/step - loss: 0.4208 - accuracy: 0.7984 - val_loss: 0.3963 - val_accuracy: 0.8148
Epoch 85/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4206 - accuracy: 0.7987 - val_loss: 0.3934 - val_accuracy: 0.8141
Epoch 86/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4206 - accuracy: 0.7998 - val_loss: 0.3937 - val_accuracy: 0.8162
Epoch 87/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4211 - accuracy: 0.7991 - val_loss: 0.3950 - val_accuracy: 0.8112
Epoch 88/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4204 - accuracy: 0.7991 - val_loss: 0.3925 - val_accuracy: 0.8112
Epoch 89/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4205 - accuracy: 0.8003 - val_loss: 0.3960 - val_accuracy: 0.8077
Epoch 90/150 2817/2817 [==============================] - 11s 4ms/step - loss: 0.4200 - accuracy: 0.7989 - val_loss: 0.3917 - val_accuracy: 0.8155
Epoch 91/150 2809/2817 [============================>.] - ETA: 0s - loss: 0.4199 - accuracy: 0.8015
Restoring model weights from the end of the best epoch: 71.
2817/2817 [==============================] - 11s 4ms/step - loss: 0.4204 - accuracy: 0.8012 - val_loss: 0.3932 - val_accuracy: 0.8148
Epoch 91: early stopping
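The log ends with "Restoring model weights from the end of the best epoch: 71" and "Epoch 91: early stopping". That is consistent with patience=20 on the training accuracy: it peaked at epoch 71 (0.8026) and did not improve during the next 20 epochs. The patience rule can be sketched in plain Python as below. This is a simplified model of the behaviour, not the Keras implementation, and both metric lists are hypothetical values chosen for illustration.

```python
def simulate_early_stopping(values, patience, min_delta=0.0, mode="max"):
    """Return (stop_epoch, best_epoch), both 1-based.

    stop_epoch is None when the patience limit is never reached.
    """
    best = None
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if best is None:
            improved = True
        elif mode == "max":
            improved = value - min_delta > best   # e.g. accuracy
        else:
            improved = value + min_delta < best   # e.g. loss
        if improved:
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Hypothetical metric histories (not taken from the log above)
train_acc = [0.70, 0.75, 0.78, 0.80, 0.79, 0.80, 0.79]
val_loss = [0.50, 0.45, 0.44, 0.46, 0.47, 0.48, 0.49]

acc_result = simulate_early_stopping(train_acc, patience=3, mode="max")
loss_result = simulate_early_stopping(val_loss, patience=3, mode="min")
print(acc_result)   # (7, 4)
print(loss_result)  # (6, 3)
```

Monitoring a validation metric such as val_loss is a common alternative when the goal is to stop before the model starts to overfit; the notebook monitors the training accuracy instead.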
fig_3 = go.Figure()
fig_3.add_trace(go.Scatter(x =np.arange(0,len(model_history_2.history['accuracy'])),
y = model_history_2.history['val_accuracy'],
mode='lines+markers',
name='val_accuracy'))
fig_3.add_trace(go.Scatter(x =np.arange(0,len(model_history_2.history['accuracy'])),
y = model_history_2.history['accuracy'],
mode='lines+markers',
name='Accuracy'))
fig_3.update_layout(title = 'ACCURACY vs VALIDATION_ACCURACY')
fig_3.update_xaxes(title_text="Epochs")
fig_3.update_yaxes(title_text="Accuracy")
fig_3.show()
fig_4 = go.Figure()
fig_4.add_trace(go.Scatter(x =np.arange(0,len(model_history_2.history['loss'])),
y = model_history_2.history['loss'],
mode='lines+markers',
name='loss'))
fig_4.add_trace(go.Scatter(x =np.arange(0,len(model_history_2.history['loss'])),
y = model_history_2.history['val_loss'],
mode='lines+markers',
name='val_loss'))
fig_4.update_layout(title = 'LOSS vs VALIDATION_LOSS')
fig_4.update_xaxes(title_text="Epochs")
fig_4.update_yaxes(title_text="Loss")
fig_4.show()
ANN_model_2.evaluate(x_test, y_test)
45/45 [==============================] - 0s 4ms/step - loss: 0.3967 - accuracy: 0.8077
[0.3967050015926361, 0.8076649904251099]
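The step counts in the progress bars follow from the row counts and the batch sizes. The sketch below reproduces them, assuming the training set holds the 7043 rows minus the 1409 test rows (the support total in the classification report) and that `evaluate` uses the Keras default batch size of 32; the helper name `steps_per_epoch` is introduced only for this example.

```python
import math

def steps_per_epoch(n_rows, batch_size):
    """Number of batches needed to see every row once (last batch may be smaller)."""
    return math.ceil(n_rows / batch_size)

n_total = 7043                 # rows in the dataset
n_test = 1409                  # rows in the test split
n_train = n_total - n_test     # rows left for training

train_steps = steps_per_epoch(n_train, batch_size=2)    # batch size used in fit
eval_steps = steps_per_epoch(n_test, batch_size=32)     # default batch size in evaluate

print(n_train, train_steps, eval_steps)  # 5634 2817 45
```

With a batch size of 2 every epoch needs 2817 weight updates, which is why each epoch takes about 11 seconds in the log; a larger batch size would reduce the number of steps per epoch, possibly at the cost of a different convergence behaviour.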
predict_2 = ANN_model_2.predict(x_test)
45/45 [==============================] - 0s 2ms/step
predict_2
array([[0.00228695],
[0.09408851],
[0.00781803],
...,
[0.36911735],
[0.67390597],
[0.7055226 ]], dtype=float32)
# Convert the predicted probabilities into binary labels using a 0.5 threshold
predict_new_2 = []
for x in predict_2:
    if x >= 0.5:
        predict_new_2.append(1)
    else:
        predict_new_2.append(0)
predict_new_2[-10 : ]
[0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
plx.imshow(confusion_matrix( y_test, predict_new_2), text_auto = True)
print(classification_report(y_test, predict_new_2))
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0.0 | 0.84 | 0.91 | 0.88 | 1062 |
| 1.0 | 0.65 | 0.48 | 0.55 | 347 |
| accuracy | | | 0.81 | 1409 |
| macro avg | 0.75 | 0.70 | 0.72 | 1409 |
| weighted avg | 0.80 | 0.81 | 0.80 | 1409 |
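The f1-score column is the harmonic mean of precision and recall. As a quick check, the sketch below recomputes the value for the churn class (1.0) from the rounded precision and recall shown in the report; the function name `f1_from` is an assumption made for this example. Because the inputs are already rounded to two decimals, the last digit can differ from the report for other rows.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall of the churn class, as printed in the report
churn_f1 = f1_from(0.65, 0.48)
print(round(churn_f1, 2))  # 0.55
```

A recall of 0.48 means the model finds roughly half of the 347 churners in the test set, even though the overall accuracy is 0.81.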
fig = plx.histogram(df, x = 'Churn', title = "Churn", color = 'Churn')
fig.update_traces(dict(marker_line_width=0))
fig.show()
# Split the rows by class: 0 = customer stayed, 1 = customer churned
churn_0 = df[df['Churn'] == 0]
churn_1 = df[df['Churn'] == 1]
len(churn_0), len(churn_1)
(5174, 1869)
len(df)
7043
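These counts show how imbalanced the classes are and give a baseline for judging accuracy. The short calculation below uses the class counts above and the test-set support values from the classification report; the variable names are introduced only for this example.

```python
n_no_churn, n_churn = 5174, 1869      # class counts in the full dataset
total = n_no_churn + n_churn

churn_share = n_churn / total                          # fraction of churners
majority_baseline = max(n_no_churn, n_churn) / total   # accuracy of always predicting "no churn"

# Same baseline on the test split (support values from the classification report)
test_baseline = 1062 / (1062 + 347)

print(total)                        # 7043
print(round(churn_share, 4))        # 0.2654
print(round(majority_baseline, 4))  # 0.7346
print(round(test_baseline, 4))      # 0.7537
```

A model that always predicts "no churn" would already reach about 75% accuracy on the test split, so the 0.81 accuracy reported above is a modest gain, and per-class recall is the more informative number.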
x_new, y_new = df_old.drop(['Churn'], axis = 1), df_old['Churn']
x_new
| SeniorCitizen | Partner | Dependents | tenure | PhoneService | MultipleLines | OnlineSecurity | OnlineBackup | DeviceProtection | TechSupport | StreamingTV | StreamingMovies | PaperlessBilling | MonthlyCharges | TotalCharges | gender_Male | InternetService_DSL | InternetService_Fiber optic | InternetService_No | Contract_Month-to-month | Contract_One year | Contract_Two year | PaymentMethod_Bank transfer (automatic) | PaymentMethod_Credit card (automatic) | PaymentMethod_Electronic check | PaymentMethod_Mailed check | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 1.0 | 0.0 | 0.013889 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.115423 | 0.001275 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 1 | 0.0 | 0.0 | 0.0 | 0.472222 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.385075 | 0.215867 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 2 | 0.0 | 0.0 | 0.0 | 0.027778 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.354229 | 0.010310 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 3 | 0.0 | 0.0 | 0.0 | 0.625000 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.239303 | 0.210241 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 4 | 0.0 | 0.0 | 0.0 | 0.027778 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.521891 | 0.015330 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7038 | 0.0 | 1.0 | 1.0 | 0.333333 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.662189 | 0.227521 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 7039 | 0.0 | 1.0 | 1.0 | 1.000000 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.845274 | 0.847461 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 7040 | 0.0 | 1.0 | 1.0 | 0.152778 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.112935 | 0.037809 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 7041 | 1.0 | 1.0 | 0.0 | 0.055556 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.558706 | 0.033210 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 7042 | 0.0 | 0.0 | 0.0 | 0.916667 | 1.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.869652 | 0.787641 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
7043 rows × 26 columns
y_new
0 0.0
1 0.0
2 1.0
3 0.0
4 1.0
...
7038 0.0
7039 0.0
7040 0.0
7041 1.0
7042 0.0
Name: Churn, Length: 7043, dtype: float64
#!pip install imbalanced-learn
from imblearn.over_sampling import RandomOverSampler
# Randomly duplicate minority-class (churn) rows until both classes have the same count
ros = RandomOverSampler(sampling_strategy = 'auto')
x_resampled, y_resampled = ros.fit_resample(x_new, y_new)
len(x_resampled)
10348
from collections import Counter
print(Counter(y_resampled))
Counter({0.0: 5174, 1.0: 5174})
x_train_3, x_test_3, y_train_3, y_test_3 = train_test_split( x_resampled, y_resampled, test_size = 0.2, random_state = 35)
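One caveat about the order of these two steps: because the data is oversampled before it is split, copies of the same churn row can end up in both the training and the test set, which tends to make the validation scores optimistic. A common alternative is to split first and oversample only the training portion. The sketch below illustrates that order in plain Python; `random_oversample` is a hand-rolled stand-in written for this example, not the imbalanced-learn implementation, and the toy row ids are hypothetical. With the notebook's libraries, the same order would mean calling `train_test_split` on `x_new, y_new` and then `ros.fit_resample` on the training part only.

```python
import random

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Toy data: row ids stand in for feature vectors (8 majority rows, 4 minority rows)
rows = list(range(12))
labels = [0] * 8 + [1] * 4

# 1) Hold out a test set first (here simply every fourth row)
test_ids = set(rows[::4])
train_rows = [r for r in rows if r not in test_ids]
train_labels = [labels[r] for r in train_rows]

# 2) Oversample only the training portion
bal_rows, bal_labels = random_oversample(train_rows, train_labels)

print(sorted(test_ids))
print(bal_labels.count(0), bal_labels.count(1))
print(set(bal_rows).isdisjoint(test_ids))
```

The test set keeps the original class ratio in this setup, so its metrics describe how the model would behave on unseen, still imbalanced data.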
ANN_model_3 = Sequential()
# First hidden layer (27 units); the resampled input has 26 feature columns
ANN_model_3.add(Dense(units = 27, activation = 'relu'))
# Second hidden layer (15 units), followed by 40% dropout
ANN_model_3.add(Dense(units = 15, activation = 'relu'))
ANN_model_3.add(Dropout(0.4))
# Third hidden layer (7 units), followed by 30% dropout
ANN_model_3.add(Dense(units = 7, activation = 'relu'))
ANN_model_3.add(Dropout(0.3))
# Output layer: a single sigmoid unit that returns the churn probability
ANN_model_3.add(Dense(units = 1, activation = 'sigmoid'))
ANN_model_3.compile(optimizer = 'adam',
loss = 'binary_crossentropy',
metrics = ['accuracy'])
early_stopping = tf.keras.callbacks.EarlyStopping(
monitor="accuracy",
min_delta=0.0001,
patience=20,
verbose=1,
mode="auto",
baseline=None,
restore_best_weights=True
)
model_history_3 = ANN_model_3.fit(x_train_3, y_train_3, batch_size = 2, epochs = 150, validation_data = (x_test_3, y_test_3), callbacks = [early_stopping])
Epoch 1/150 4139/4139 [==============================] - 20s 5ms/step - loss: 0.5543 - accuracy: 0.7218 - val_loss: 0.4874 - val_accuracy: 0.7652
Epoch 2/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.5205 - accuracy: 0.7538 - val_loss: 0.4717 - val_accuracy: 0.7749
Epoch 3/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.5026 - accuracy: 0.7586 - val_loss: 0.4747 - val_accuracy: 0.7681
Epoch 4/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.5085 - accuracy: 0.7602 - val_loss: 0.4748 - val_accuracy: 0.7744
Epoch 5/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4999 - accuracy: 0.7644 - val_loss: 0.4690 - val_accuracy: 0.7797
Epoch 6/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4932 - accuracy: 0.7621 - val_loss: 0.4656 - val_accuracy: 0.7870
Epoch 7/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4875 - accuracy: 0.7665 - val_loss: 0.4592 - val_accuracy: 0.7812
Epoch 8/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4844 - accuracy: 0.7727 - val_loss: 0.4585 - val_accuracy: 0.7850
Epoch 9/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4797 - accuracy: 0.7693 - val_loss: 0.4596 - val_accuracy: 0.7845
Epoch 10/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4728 - accuracy: 0.7780 - val_loss: 0.4600 - val_accuracy: 0.7836
Epoch 11/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4680 - accuracy: 0.7770 - val_loss: 0.4744 - val_accuracy: 0.7797
Epoch 12/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4678 - accuracy: 0.7771 - val_loss: 0.4639 - val_accuracy: 0.7831
Epoch 13/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4647 - accuracy: 0.7801 - val_loss: 0.4561 - val_accuracy: 0.7899
Epoch 14/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4625 - accuracy: 0.7845 - val_loss: 0.4624 - val_accuracy: 0.7855
Epoch 15/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4631 - accuracy: 0.7873 - val_loss: 0.4637 - val_accuracy: 0.7855
Epoch 16/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4544 - accuracy: 0.7886 - val_loss: 0.4592 - val_accuracy: 0.7865
Epoch 17/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4548 - accuracy: 0.7857 - val_loss: 0.4571 - val_accuracy: 0.7850
Epoch 18/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4500 - accuracy: 0.7904 - val_loss: 0.4598 - val_accuracy: 0.7908
Epoch 19/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4477 - accuracy: 0.7910 - val_loss: 0.4624 - val_accuracy: 0.7778
Epoch 20/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4431 - accuracy: 0.7957 - val_loss: 0.4600 - val_accuracy: 0.7903
Epoch 21/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4465 - accuracy: 0.7903 - val_loss: 0.4544 - val_accuracy: 0.7797
Epoch 22/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4414 - accuracy: 0.7893 - val_loss: 0.4585 - val_accuracy: 0.7870
Epoch 23/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4414 - accuracy: 0.7922 - val_loss: 0.4600 - val_accuracy: 0.7889
Epoch 24/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4439 - accuracy: 0.7973 - val_loss: 0.4569 - val_accuracy: 0.7879
Epoch 25/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4364 - accuracy: 0.7933 - val_loss: 0.4550 - val_accuracy: 0.7913
Epoch 26/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4352 - accuracy: 0.7986 - val_loss: 0.4548 - val_accuracy: 0.7816
Epoch 27/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4291 - accuracy: 0.7955 - val_loss: 0.4558 - val_accuracy: 0.7836
Epoch 28/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4321 - accuracy: 0.7942 - val_loss: 0.4559 - val_accuracy: 0.7845
Epoch 29/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4280 - accuracy: 0.7964 - val_loss: 0.4611 - val_accuracy: 0.7908
Epoch 30/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4301 - accuracy: 0.7983 - val_loss: 0.4646 - val_accuracy: 0.7865
Epoch 31/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4265 - accuracy: 0.7975 - val_loss: 0.4624 - val_accuracy: 0.7845
Epoch 32/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4234 - accuracy: 0.7948 - val_loss: 0.4601 - val_accuracy: 0.7826
Epoch 33/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4207 - accuracy: 0.8026 - val_loss: 0.4609 - val_accuracy: 0.7918
Epoch 34/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4252 - accuracy: 0.8008 - val_loss: 0.4634 - val_accuracy: 0.7889
Epoch 35/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4194 - accuracy: 0.8010 - val_loss: 0.4630 - val_accuracy: 0.7865
Epoch 36/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4197 - accuracy: 0.8015 - val_loss: 0.4610 - val_accuracy: 0.7850
Epoch 37/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4201 - accuracy: 0.7997 - val_loss: 0.4634 - val_accuracy: 0.7874
Epoch 38/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4180 - accuracy: 0.8036 - val_loss: 0.4590 - val_accuracy: 0.7807
Epoch 39/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4186 - accuracy: 0.7997 - val_loss: 0.4664 - val_accuracy: 0.7870
Epoch 40/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4172 - accuracy: 0.8026 - val_loss: 0.4662 - val_accuracy: 0.7947
Epoch 41/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4175 - accuracy: 0.8009 - val_loss: 0.4649 - val_accuracy: 0.7787
Epoch 42/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4163 - accuracy: 0.8019 - val_loss: 0.4587 - val_accuracy: 0.7889
Epoch 43/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4153 - accuracy: 0.7986 - val_loss: 0.4574 - val_accuracy: 0.7855
Epoch 44/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4135 - accuracy: 0.8021 - val_loss: 0.4567 - val_accuracy: 0.7874
Epoch 45/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4100 - accuracy: 0.8032 - val_loss: 0.4618 - val_accuracy: 0.7768
Epoch 46/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4108 - accuracy: 0.8014 - val_loss: 0.4570 - val_accuracy: 0.7932
Epoch 47/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4137 - accuracy: 0.7992 - val_loss: 0.4602 - val_accuracy: 0.7928
Epoch 48/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4113 - accuracy: 0.8014 - val_loss: 0.4633 - val_accuracy: 0.7932
Epoch 49/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4106 - accuracy: 0.8085 - val_loss: 0.4719 - val_accuracy: 0.7923
Epoch 50/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4133 - accuracy: 0.8024 - val_loss: 0.4722 - val_accuracy: 0.7879
Epoch 51/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4072 - accuracy: 0.8065 - val_loss: 0.4618 - val_accuracy: 0.7952
Epoch 52/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4149 - accuracy: 0.7987 - val_loss: 0.4667 - val_accuracy: 0.7855
Epoch 53/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4102 - accuracy: 0.7993 - val_loss: 0.4609 - val_accuracy: 0.7831
Epoch 54/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4044 - accuracy: 0.8060 - val_loss: 0.4737 - val_accuracy: 0.7913
Epoch 55/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4088 - accuracy: 0.8050 - val_loss: 0.4579 - val_accuracy: 0.7908
Epoch 56/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4105 - accuracy: 0.8070 - val_loss: 0.4626 - val_accuracy: 0.7976
Epoch 57/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4045 - accuracy: 0.8067 - val_loss: 0.4712 - val_accuracy: 0.7976
Epoch 58/150 4139/4139 [==============================] - 22s 5ms/step - loss: 0.4076 - accuracy: 0.8068 - val_loss: 0.4731 - val_accuracy: 0.7894
Epoch 59/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4038 - accuracy: 0.8101 - val_loss: 0.4543 - val_accuracy: 0.7874
Epoch 60/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4069 - accuracy: 0.8068 - val_loss: 0.4690 - val_accuracy: 0.7923
Epoch 61/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4095 - accuracy: 0.8077 - val_loss: 0.4561 - val_accuracy: 0.7850
Epoch 62/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4022 - accuracy: 0.8054 - val_loss: 0.4714 - val_accuracy: 0.7947
Epoch 63/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4082 - accuracy: 0.8043 - val_loss: 0.4884 - val_accuracy: 0.7942
Epoch 64/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4059 - accuracy: 0.8070 - val_loss: 0.4682 - val_accuracy: 0.7870
Epoch 65/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4043 - accuracy: 0.8079 - val_loss: 0.4684 - val_accuracy: 0.7947
Epoch 66/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4040 - accuracy: 0.8070 - val_loss: 0.4671 - val_accuracy: 0.7971
Epoch 67/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4008 - accuracy: 0.8085 - val_loss: 0.4582 - val_accuracy: 0.7903
Epoch 68/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4017 - accuracy: 0.8059 - val_loss: 0.4787 - val_accuracy: 0.8005
Epoch 69/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.4026 - accuracy: 0.8120 - val_loss: 0.4569 - val_accuracy: 0.7899
Epoch 70/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3977 - accuracy: 0.8085 - val_loss: 0.4774 - val_accuracy: 0.7908
Epoch 71/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4011 - accuracy: 0.8114 - val_loss: 0.4534 - val_accuracy: 0.7928
Epoch 72/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.3983 - accuracy: 0.8100 - val_loss: 0.4745 - val_accuracy: 0.8000
Epoch 73/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4015 - accuracy: 0.8114 - val_loss: 0.4581 - val_accuracy: 0.7918
Epoch 74/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4025 - accuracy: 0.8140 - val_loss: 0.4667 - val_accuracy: 0.7899
Epoch 75/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.4036 - accuracy: 0.8086 - val_loss: 0.4606 - val_accuracy: 0.7841
Epoch 76/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3946 - accuracy: 0.8119 - val_loss: 0.4566 - val_accuracy: 0.7923
Epoch 77/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3984 - accuracy: 0.8118 - val_loss: 0.4669 - val_accuracy: 0.8024
Epoch 78/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.3987 - accuracy: 0.8118 - val_loss: 0.4753 - val_accuracy: 0.7884
Epoch 79/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3976 - accuracy: 0.8123 - val_loss: 0.4696 - val_accuracy: 0.7986
Epoch 80/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3971 - accuracy: 0.8128 - val_loss: 0.4660 - val_accuracy: 0.7899
Epoch 81/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3997 - accuracy: 0.8113 - val_loss: 0.4617 - val_accuracy: 0.7995
Epoch 82/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.4007 - accuracy: 0.8100 - val_loss: 0.4524 - val_accuracy: 0.7966
Epoch 83/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3924 - accuracy: 0.8161 - val_loss: 0.4778 - val_accuracy: 0.7976
Epoch 84/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3920 - accuracy: 0.8136 - val_loss: 0.5107 - val_accuracy: 0.7966
Epoch 85/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3994 - accuracy: 0.8157 - val_loss: 0.4620 - val_accuracy: 0.7923
Epoch 86/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3993 - accuracy: 0.8089 - val_loss: 0.4657 - val_accuracy: 0.7908
Epoch 87/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3929 - accuracy: 0.8148 - val_loss: 0.4649 - val_accuracy: 0.8019
Epoch 88/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3984 - accuracy: 0.8123 - val_loss: 0.4707 - val_accuracy: 0.7947
Epoch 89/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3939 - accuracy: 0.8165 - val_loss: 0.4627 - val_accuracy: 0.7976
Epoch 90/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3916 - accuracy: 0.8148 - val_loss: 0.4733 - val_accuracy: 0.7961
Epoch 91/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3977 - accuracy: 0.8134 - val_loss: 0.4663 - val_accuracy: 0.7821
Epoch 92/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3974 - accuracy: 0.8140 - val_loss: 0.4833 - val_accuracy: 0.7913
Epoch 93/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.3917 - accuracy: 0.8158 - val_loss: 0.4947 - val_accuracy: 0.7995
Epoch 94/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3947 - accuracy: 0.8157 - val_loss: 0.4836 - val_accuracy: 0.7928
Epoch 95/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3974 - accuracy: 0.8109 - val_loss: 0.4750 - val_accuracy: 0.7899
Epoch 96/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3903 - accuracy: 0.8147 - val_loss: 0.4873 - val_accuracy: 0.8034
Epoch 97/150 4139/4139 [==============================] - 24s 6ms/step - loss: 0.3961 - accuracy: 0.8170 - val_loss: 0.4743 - val_accuracy: 0.7889
Epoch 98/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3940 - accuracy: 0.8136 - val_loss: 0.4664 - val_accuracy: 0.7913
Epoch 99/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3971 - accuracy: 0.8144 - val_loss: 0.4689 - val_accuracy: 0.7942
Epoch 100/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3863 - accuracy: 0.8149 - val_loss: 0.4742 - val_accuracy: 0.8034
Epoch 101/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3973 - accuracy: 0.8138 - val_loss: 0.4677 - val_accuracy: 0.7995
Epoch 102/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3876 - accuracy: 0.8177 - val_loss: 0.4730 - val_accuracy: 0.7932
Epoch 103/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3942 - accuracy: 0.8171 - val_loss: 0.4546 - val_accuracy: 0.7995
Epoch 104/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3973 - accuracy: 0.8094 - val_loss: 0.4760 - val_accuracy: 0.7952
Epoch 105/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3959 - accuracy: 0.8107 - val_loss: 0.4696 - val_accuracy: 0.8034
Epoch 106/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3887 - accuracy: 0.8192 - val_loss: 0.4790 - val_accuracy: 0.8005
Epoch 107/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3982 - accuracy: 0.8126 - val_loss: 0.4753 - val_accuracy: 0.8005
Epoch 108/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3901 - accuracy: 0.8182 - val_loss: 0.5050 - val_accuracy: 0.8024
Epoch 109/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3906 - accuracy: 0.8147 - val_loss: 0.4932 - val_accuracy: 0.7995
Epoch 110/150 4139/4139 [==============================] - 22s 5ms/step - loss: 0.3933 - accuracy: 0.8187 - val_loss: 0.4880 - val_accuracy: 0.7976
Epoch 111/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3890 - accuracy: 0.8157 - val_loss: 0.4825 - val_accuracy: 0.8048
Epoch 112/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3934 - accuracy: 0.8166 - val_loss: 0.4522 - val_accuracy: 0.7976
Epoch 113/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3927 - accuracy: 0.8173 - val_loss: 0.4651 - val_accuracy: 0.7903
Epoch 114/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3940 - accuracy: 0.8137 - val_loss: 0.4699 - val_accuracy: 0.7976
Epoch 115/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3881 - accuracy: 0.8178 - val_loss: 0.4644 - val_accuracy: 0.7894
Epoch 116/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3904 - accuracy: 0.8164 - val_loss: 0.5028 - val_accuracy: 0.7899
Epoch 117/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3874 - accuracy: 0.8219 - val_loss: 0.4763 - val_accuracy: 0.8000
Epoch 118/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3937 - accuracy: 0.8177 - val_loss: 0.4695 - val_accuracy: 0.7971
Epoch 119/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3805 - accuracy: 0.8230 - val_loss: 0.4868 - val_accuracy: 0.8053
Epoch 120/150 4139/4139 [==============================] - 19s 4ms/step - loss: 0.3941 - accuracy: 0.8167 - val_loss: 0.4712 - val_accuracy: 0.7971
Epoch 121/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3838 - accuracy: 0.8221 - val_loss: 0.4761 - val_accuracy: 0.8005
Epoch 122/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3895 - accuracy: 0.8204 - val_loss: 0.4834 - val_accuracy: 0.8029
Epoch 123/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3903 - accuracy: 0.8154 - val_loss: 0.4779 - val_accuracy: 0.8005
Epoch 124/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3876 - accuracy: 0.8211 - val_loss: 0.4721 - val_accuracy: 0.8019
Epoch 125/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3849 - accuracy: 0.8209 - val_loss: 0.4827 - val_accuracy: 0.7976
Epoch 126/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3942 - accuracy: 0.8172 - val_loss: 0.4762 - val_accuracy: 0.7889
Epoch 127/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3873 - accuracy: 0.8193 - val_loss: 0.5000 - val_accuracy: 0.7971
Epoch 128/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3882 - accuracy: 0.8184 - val_loss: 0.4710 - val_accuracy: 0.8014
Epoch 129/150 4139/4139 [==============================] - 19s 5ms/step - loss: 0.3845 - accuracy: 0.8196 - val_loss: 0.4926 - val_accuracy: 0.8019
Epoch 130/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3850 - accuracy: 0.8202 - val_loss: 0.5442 - val_accuracy: 0.7976
Epoch 131/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3844 - accuracy: 0.8187 - val_loss: 0.5249 - val_accuracy: 0.7937
Epoch 132/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3854 - accuracy: 0.8187 - val_loss: 0.5153 - val_accuracy: 0.7961
Epoch 133/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3885 - accuracy: 0.8170 - val_loss: 0.4789 - val_accuracy: 0.7981
Epoch 134/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3820 - accuracy: 0.8193 - val_loss: 0.4811 - val_accuracy: 0.7932
Epoch 135/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3888 - accuracy: 0.8161 - val_loss: 0.4679 - val_accuracy: 0.7860
Epoch 136/150 4139/4139 [==============================] - 18s 4ms/step - loss: 0.3838 - accuracy: 0.8161 - val_loss: 0.5319 - val_accuracy: 0.7899
Epoch 137/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.3883 - accuracy: 0.8166 - val_loss: 0.5034 - val_accuracy: 0.7990
Epoch 138/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.3859 - accuracy: 0.8178 - val_loss: 0.4971 - val_accuracy: 0.8000
Epoch 139/150 4134/4139 [============================>.] - ETA: 0s - loss: 0.3864 - accuracy: 0.8175
Restoring model weights from the end of the best epoch: 119.
4139/4139 [==============================] - 17s 4ms/step - loss: 0.3866 - accuracy: 0.8173 - val_loss: 0.4838 - val_accuracy: 0.7976
Epoch 139: early stopping
ANN_model_3.evaluate(x_test_3, y_test_3)
65/65 [==============================] - 0s 3ms/step - loss: 0.4868 - accuracy: 0.8053
[0.4867652654647827, 0.8053140044212341]
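The `4139/4139` and `65/65` counters in these outputs count batches, not rows. A small sketch of the arithmetic, assuming a training split of 8,278 rows (inferred from 4,139 steps at `batch_size = 2`; the exact size is not printed here) and the Keras default batch size of 32 for `evaluate` on the 2,070 test rows. The helper name `steps_per_epoch` is only illustrative.

```python
import math


def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of batches needed to cover n_samples once."""
    return math.ceil(n_samples / batch_size)


# Assumed training size: 4139 steps * batch_size 2 (8277 would give the same count).
train_steps = steps_per_epoch(8278, 2)
# Test size taken from the classification report support column; 32 is the Keras default.
test_steps = steps_per_epoch(2070, 32)
print(train_steps, test_steps)
```

A larger `batch_size` would cut the number of steps per epoch and probably the epoch time of roughly 18 s, though it may also change how the model converges.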
# Training vs. validation accuracy per epoch for ANN_model_3
fig_5 = go.Figure()
fig_5.add_trace(go.Scatter(x = np.arange(0, len(model_history_3.history['accuracy'])),
                           y = model_history_3.history['val_accuracy'],
                           mode = 'lines+markers',
                           name = 'val_accuracy'))
fig_5.add_trace(go.Scatter(x = np.arange(0, len(model_history_3.history['accuracy'])),
                           y = model_history_3.history['accuracy'],
                           mode = 'lines+markers',
                           name = 'Accuracy'))
fig_5.update_layout(title = 'ACCURACY vs VALIDATION_ACCURACY')
fig_5.update_xaxes(title_text = "Epochs")
fig_5.update_yaxes(title_text = "Accuracy")
fig_5.show()
# Training vs. validation loss per epoch for ANN_model_3
fig_6 = go.Figure()
fig_6.add_trace(go.Scatter(x = np.arange(0, len(model_history_3.history['loss'])),
                           y = model_history_3.history['loss'],
                           mode = 'lines+markers',
                           name = 'loss'))
fig_6.add_trace(go.Scatter(x = np.arange(0, len(model_history_3.history['loss'])),
                           y = model_history_3.history['val_loss'],
                           mode = 'lines+markers',
                           name = 'val_loss'))
fig_6.update_layout(title = 'LOSS vs VALIDATION_LOSS')
fig_6.update_xaxes(title_text = "Epochs")
fig_6.update_yaxes(title_text = "Loss")
fig_6.show()
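# Predicted churn probabilities (sigmoid outputs) for the test set
predict_3 = ANN_model_3.predict(x_test_3)
# Convert each probability to a class label using a 0.5 threshold
predict_new_3 = []
for x in predict_3:
    if x >= 0.5:
        predict_new_3.append(1)
    else:
        predict_new_3.append(0)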
65/65 [==============================] - 0s 2ms/step
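The loop above turns each sigmoid output into a class label by thresholding at 0.5. As a compact alternative, here is a minimal sketch in plain Python. The function `to_labels` and the sample probabilities are hypothetical, and the nested list stands in for the `(n, 1)` array that `predict` returns.

```python
def to_labels(probabilities, threshold=0.5):
    """Map each one-element row of sigmoid outputs to 0 or 1."""
    return [1 if row[0] >= threshold else 0 for row in probabilities]


# Hypothetical sigmoid outputs, shaped like the (n, 1) result of predict().
sample = [[0.12], [0.50], [0.87], [0.49]]
print(to_labels(sample))        # [0, 1, 1, 0]
print(to_labels(sample, 0.9))   # [0, 0, 0, 0]
```

Keeping the threshold as a parameter makes it easy to experiment: raising it generally lowers recall on the churn class in exchange for higher precision.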
plx.imshow(confusion_matrix( y_test_3, predict_new_3), text_auto = True)
print(classification_report(y_test_3, predict_new_3))
              precision    recall  f1-score   support

         0.0       0.89      0.70      0.78      1033
         1.0       0.75      0.91      0.82      1037

    accuracy                           0.81      2070
   macro avg       0.82      0.81      0.80      2070
weighted avg       0.82      0.81      0.80      2070
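Each figure in the report follows from the four confusion matrix counts. A sketch of that calculation for the positive (churn) class is given below. The counts are hypothetical round numbers chosen for readability, not the values of the matrix plotted above, and `binary_metrics` is an illustrative helper rather than a scikit-learn function.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy for the positive class."""
    precision = tp / (tp + fp)          # of predicted churners, how many churned
    recall = tp / (tp + fn)             # of actual churners, how many were caught
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Hypothetical counts: 900 true positives, 300 false positives,
# 100 false negatives, 700 true negatives.
precision, recall, f1, accuracy = binary_metrics(tp=900, fp=300, fn=100, tn=700)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}")
```

A pattern of high recall with lower precision for class 1, as in the report, means the model catches most churners at the cost of flagging some loyal customers.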
x_train_3_new = x_train_3.drop(to_remove_features, axis = 1)
x_test_3_new = x_test_3.drop(to_remove_features, axis = 1)
ANN_model_4 = Sequential()
# 1st hidden layer (the input layer is implicit; its shape is inferred from the training data)
ANN_model_4.add(Dense(units = 9, activation = 'relu'))
# 2nd hidden layer, followed by dropout
ANN_model_4.add(Dense(units = 7, activation = 'relu'))
ANN_model_4.add(Dropout(0.3))
# 3rd hidden layer, followed by dropout
ANN_model_4.add(Dense(units = 3, activation = 'relu'))
ANN_model_4.add(Dropout(0.3))
# Output layer: a single sigmoid unit giving the churn probability
ANN_model_4.add(Dense(units = 1, activation = 'sigmoid'))
ANN_model_4.compile(optimizer = 'adam',
                    loss = 'binary_crossentropy',
                    metrics = ['accuracy'])
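# Stop once training accuracy has failed to improve by at least 0.0001 for
# 20 consecutive epochs, then restore the weights of the best epoch.
# Note: this monitors training "accuracy", not "val_accuracy".
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="accuracy",
    min_delta=0.0001,
    patience=20,
    verbose=1,
    mode="auto",
    baseline=None,
    restore_best_weights=True
)
# The test split doubles as validation data here; callbacks are passed as a list.
model_history_4 = ANN_model_4.fit(x_train_3_new, y_train_3, batch_size = 2, epochs = 150, validation_data = (x_test_3_new, y_test_3), callbacks = [early_stopping])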
Epoch 1/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.6040 - accuracy: 0.6622 - val_loss: 0.5274 - val_accuracy: 0.7594
Epoch 2/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.5670 - accuracy: 0.7050 - val_loss: 0.5117 - val_accuracy: 0.7609
Epoch 3/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5690 - accuracy: 0.7021 - val_loss: 0.5062 - val_accuracy: 0.7647
Epoch 4/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.5616 - accuracy: 0.7114 - val_loss: 0.4981 - val_accuracy: 0.7623
Epoch 5/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.5528 - accuracy: 0.7104 - val_loss: 0.4930 - val_accuracy: 0.7609
Epoch 6/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.5539 - accuracy: 0.7125 - val_loss: 0.5025 - val_accuracy: 0.7604
Epoch 7/150 4139/4139 [==============================] - 17s 4ms/step - loss: 0.5501 - accuracy: 0.7165 - val_loss: 0.4955 - val_accuracy: 0.7657
Epoch 8/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5504 - accuracy: 0.7108 - val_loss: 0.4920 - val_accuracy: 0.7686
Epoch 9/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5517 - accuracy: 0.7119 - val_loss: 0.4947 - val_accuracy: 0.7691
Epoch 10/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5523 - accuracy: 0.7113 - val_loss: 0.4922 - val_accuracy: 0.7696
Epoch 11/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5517 - accuracy: 0.7171 - val_loss: 0.4958 - val_accuracy: 0.7681
Epoch 12/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5506 - accuracy: 0.7167 - val_loss: 0.4878 - val_accuracy: 0.7676
Epoch 13/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5538 - accuracy: 0.7144 - val_loss: 0.5013 - val_accuracy: 0.7729
Epoch 14/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5467 - accuracy: 0.7178 - val_loss: 0.4856 - val_accuracy: 0.7599
Epoch 15/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5473 - accuracy: 0.7195 - val_loss: 0.4957 - val_accuracy: 0.7662
Epoch 16/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5459 - accuracy: 0.7148 - val_loss: 0.5051 - val_accuracy: 0.7758
Epoch 17/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5505 - accuracy: 0.7138 - val_loss: 0.4956 - val_accuracy: 0.7705
Epoch 18/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5475 - accuracy: 0.7138 - val_loss: 0.4894 - val_accuracy: 0.7652
Epoch 19/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5450 - accuracy: 0.7187 - val_loss: 0.4949 - val_accuracy: 0.7647
Epoch 20/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5496 - accuracy: 0.7142 - val_loss: 0.4929 - val_accuracy: 0.7643
Epoch 21/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5413 - accuracy: 0.7180 - val_loss: 0.4989 - val_accuracy: 0.7734
Epoch 22/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5471 - accuracy: 0.7150 - val_loss: 0.4961 - val_accuracy: 0.7725
Epoch 23/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5427 - accuracy: 0.7228 - val_loss: 0.4950 - val_accuracy: 0.7686
Epoch 24/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5408 - accuracy: 0.7222 - val_loss: 0.4952 - val_accuracy: 0.7657
Epoch 25/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5468 - accuracy: 0.7196 - val_loss: 0.4937 - val_accuracy: 0.7725
Epoch 26/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5452 - accuracy: 0.7174 - val_loss: 0.4961 - val_accuracy: 0.7681
Epoch 27/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5446 - accuracy: 0.7162 - val_loss: 0.4947 - val_accuracy: 0.7749
Epoch 28/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5426 - accuracy: 0.7166 - val_loss: 0.4976 - val_accuracy: 0.7725
Epoch 29/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5434 - accuracy: 0.7144 - val_loss: 0.5031 - val_accuracy: 0.7686
Epoch 30/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5423 - accuracy: 0.7167 - val_loss: 0.5033 - val_accuracy: 0.7778
Epoch 31/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5451 - accuracy: 0.7162 - val_loss: 0.4872 - val_accuracy: 0.7729
Epoch 32/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5472 - accuracy: 0.7125 - val_loss: 0.4969 - val_accuracy: 0.7691
Epoch 33/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5419 - accuracy: 0.7193 - val_loss: 0.4846 - val_accuracy: 0.7734
Epoch 34/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5455 - accuracy: 0.7150 - val_loss: 0.4932 - val_accuracy: 0.7681
Epoch 35/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5376 - accuracy: 0.7195 - val_loss: 0.4957 - val_accuracy: 0.7710
Epoch 36/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5479 - accuracy: 0.7139 - val_loss: 0.5128 - val_accuracy: 0.7662
Epoch 37/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5440 - accuracy: 0.7232 - val_loss: 0.4970 - val_accuracy: 0.7691
Epoch 38/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5457 - accuracy: 0.7172 - val_loss: 0.4958 - val_accuracy: 0.7700
Epoch 39/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5422 - accuracy: 0.7188 - val_loss: 0.5018 - val_accuracy: 0.7749
Epoch 40/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5431 - accuracy: 0.7153 - val_loss: 0.5043 - val_accuracy: 0.7705
Epoch 41/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5431 - accuracy: 0.7122 - val_loss: 0.4928 - val_accuracy: 0.7754
Epoch 42/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5387 - accuracy: 0.7270 - val_loss: 0.5048 - val_accuracy: 0.7681
Epoch 43/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5376 - accuracy: 0.7184 - val_loss: 0.4964 - val_accuracy: 0.7671
Epoch 44/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5426 - accuracy: 0.7185 - val_loss: 0.4961 - val_accuracy: 0.7710
Epoch 45/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5446 - accuracy: 0.7193 - val_loss: 0.5010 - val_accuracy: 0.7686
Epoch 46/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5462 - accuracy: 0.7183 - val_loss: 0.4978 - val_accuracy: 0.7778
Epoch 47/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5435 - accuracy: 0.7171 - val_loss: 0.4882 - val_accuracy: 0.7754
Epoch 48/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5409 - accuracy: 0.7235 - val_loss: 0.5009 - val_accuracy: 0.7773
Epoch 49/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5427 - accuracy: 0.7235 - val_loss: 0.4938 - val_accuracy: 0.7720
Epoch 50/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5446 - accuracy: 0.7168 - val_loss: 0.4988 - val_accuracy: 0.7773
Epoch 51/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5434 - accuracy: 0.7201 - val_loss: 0.4999 - val_accuracy: 0.7691
Epoch 52/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5478 - accuracy: 0.7222 - val_loss: 0.5020 - val_accuracy: 0.7729
Epoch 53/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5429 - accuracy: 0.7184 - val_loss: 0.4853 - val_accuracy: 0.7758
Epoch 54/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5409 - accuracy: 0.7183 - val_loss: 0.4887 - val_accuracy: 0.7773
Epoch 55/150 4139/4139 [==============================] - 16s 4ms/step - loss: 0.5436 - accuracy: 0.7214 - val_loss: 0.4972 - val_accuracy: 0.7725
Epoch 56/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5425 - accuracy: 0.7161 - val_loss: 0.4982 - val_accuracy: 0.7749
Epoch 57/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5413 - accuracy: 0.7225 - val_loss: 0.4922 - val_accuracy: 0.7734
Epoch 58/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5429 - accuracy: 0.7203 - val_loss: 0.4894 - val_accuracy: 0.7758
Epoch 59/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5355 - accuracy: 0.7257 - val_loss: 0.4955 - val_accuracy: 0.7696
Epoch 60/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5441 - accuracy: 0.7199 - val_loss: 0.5041 - val_accuracy: 0.7787
Epoch 61/150 4139/4139 [==============================] - 15s 4ms/step - loss: 0.5337 - accuracy: 0.7241 - val_loss: 0.4973 - val_accuracy: 0.7700
Epoch 62/150 4122/4139 [============================>.] - ETA: 0s - loss: 0.5386 - accuracy: 0.7261
Restoring model weights from the end of the best epoch: 42.
4139/4139 [==============================] - 15s 4ms/step - loss: 0.5387 - accuracy: 0.7257 - val_loss: 0.4959 - val_accuracy: 0.7628
Epoch 62: early stopping
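The `Restoring model weights from the end of the best epoch` message comes from the patience rule in `EarlyStopping`. Below is a simplified sketch of that rule in plain Python, assuming a metric where higher is better. The function `early_stop_epoch` and the sample accuracies are hypothetical, and the real callback handles more cases (for example `baseline` and `mode`).

```python
def early_stop_epoch(values, patience, min_delta=0.0):
    """Return (stop_epoch, best_epoch), 1-indexed, for a higher-is-better metric.

    stop_epoch is None when the run would use every value without stopping.
    """
    best = float("-inf")
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(values, start=1):
        wait += 1
        # An epoch counts as an improvement only if it beats the best by more than min_delta.
        if value - min_delta > best:
            best, best_epoch, wait = value, epoch, 0
        if wait >= patience:
            return epoch, best_epoch
    return None, best_epoch


# Hypothetical monitored accuracies: the peak is at epoch 3.
history = [0.70, 0.72, 0.75, 0.74, 0.75, 0.73]
print(early_stop_epoch(history, patience=3, min_delta=0.0001))  # (6, 3)
```

With `patience=20`, the same rule accounts for the run above stopping at epoch 62, twenty epochs after its best monitored epoch, 42.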
# Training vs. validation accuracy per epoch for ANN_model_4
fig_7 = go.Figure()
fig_7.add_trace(go.Scatter(x = np.arange(0, len(model_history_4.history['accuracy'])),
                           y = model_history_4.history['val_accuracy'],
                           mode = 'lines+markers',
                           name = 'val_accuracy'))
fig_7.add_trace(go.Scatter(x = np.arange(0, len(model_history_4.history['accuracy'])),
                           y = model_history_4.history['accuracy'],
                           mode = 'lines+markers',
                           name = 'Accuracy'))
fig_7.update_layout(title = 'ACCURACY vs VALIDATION_ACCURACY')
fig_7.update_xaxes(title_text = "Epochs")
fig_7.update_yaxes(title_text = "Accuracy")
fig_7.show()
# Training vs. validation loss per epoch for ANN_model_4
fig_8 = go.Figure()
fig_8.add_trace(go.Scatter(x = np.arange(0, len(model_history_4.history['loss'])),
                           y = model_history_4.history['loss'],
                           mode = 'lines+markers',
                           name = 'loss'))
fig_8.add_trace(go.Scatter(x = np.arange(0, len(model_history_4.history['loss'])),
                           y = model_history_4.history['val_loss'],
                           mode = 'lines+markers',
                           name = 'val_loss'))
fig_8.update_layout(title = 'LOSS vs VALIDATION_LOSS')
fig_8.update_xaxes(title_text = "Epochs")
fig_8.update_yaxes(title_text = "Loss")
fig_8.show()
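# Predicted churn probabilities (sigmoid outputs) on the reduced-feature test set
predict_4 = ANN_model_4.predict(x_test_3_new)
# Convert each probability to a class label using a 0.5 threshold
predict_new_4 = []
for x in predict_4:
    if x >= 0.5:
        predict_new_4.append(1)
    else:
        predict_new_4.append(0)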
65/65 [==============================] - 0s 1ms/step
plx.imshow(confusion_matrix(y_test_3, predict_new_4), text_auto = True)
print(classification_report(y_test_3, predict_new_4))
              precision    recall  f1-score   support

         0.0       0.76      0.79      0.77      1033
         1.0       0.78      0.74      0.76      1037

    accuracy                           0.77      2070
   macro avg       0.77      0.77      0.77      2070
weighted avg       0.77      0.77      0.77      2070
import pickle as pkl
# Persist ANN_model_3, which scored higher on the test set (0.81 vs 0.77 accuracy)
with open("Telco_customer_churn_prediction_model.pkl", "wb") as f:
    pkl.dump(ANN_model_3, f)
INFO:tensorflow:Assets written to: ram://783be2b1-7c12-4fdb-b82e-2e8b551590e0/assets